
THE AXIOMATIC METHOD


STUDIES  IN  LOGIC 

AND 

THE  FOUNDATIONS  OF 
MATHEMATICS 


L.  E.  J.  BROUWER 

E.  W.  BETH 
A.  HEYTING 

Editors 


1959 


NORTH-HOLLAND  PUBLISHING  COMPANY 
AMSTERDAM 


THE  AXIOMATIC  METHOD 

WITH  SPECIAL  REFERENCE  TO  GEOMETRY 
AND  PHYSICS 

Proceedings  of  an  International  Symposium  held  at  the 
University  of  California,  Berkeley,  December  26,  1957  —  January  4,  1958 


Edited  by 

LEON  HENKIN 

Professor of Mathematics, University of California, Berkeley

PATRICK SUPPES

Associate Professor of Philosophy, Stanford University

ALFRED TARSKI

Professor of Mathematics and Research Professor, University
of California, Berkeley




No  part  of  this  book  may  be  reproduced 

in  any  form  by  print,  microfilm  or  any 

other  means  without  written  permission 

from  the  publisher 


PRINTED IN THE NETHERLANDS


CONTENTS 


PREFACE

PART I. FOUNDATIONS OF GEOMETRY

DIE MANNIGFALTIGKEIT DER DIREKTIVEN FÜR DIE GESTALTUNG GEOMETRISCHER
AXIOMENSYSTEME. Paul Bernays

WHAT IS ELEMENTARY GEOMETRY? Alfred Tarski

SOME METAMATHEMATICAL PROBLEMS CONCERNING ELEMENTARY HYPERBOLIC
GEOMETRY. Wanda Szmielew

DIMENSION IN ELEMENTARY EUCLIDEAN GEOMETRY. Dana Scott

BINARY RELATIONS AS PRIMITIVE NOTIONS IN ELEMENTARY GEOMETRY.
Raphael M. Robinson

REMARKS ON PRIMITIVE NOTIONS FOR ELEMENTARY EUCLIDEAN AND
NON-EUCLIDEAN PLANE GEOMETRY. H. L. Royden

DIRECT INTRODUCTION OF WEIERSTRASS HOMOGENEOUS COORDINATES IN
THE HYPERBOLIC PLANE, ON THE BASIS OF THE ENDCALCULUS OF HILBERT.
Paul Szász

AXIOMATISCHER AUFBAU DER EBENEN ABSOLUTEN GEOMETRIE.
Friedrich Bachmann

NEW METRIC POSTULATES FOR ELLIPTIC n-SPACE. Leonard M. Blumenthal

AXIOMS FOR GEODESICS AND THEIR IMPLICATIONS. Herbert Busemann

AXIOMS FOR INTUITIONISTIC PLANE AFFINE GEOMETRY. A. Heyting

GRUNDLAGEN DER GEOMETRIE VOM STANDPUNKT DER ALLGEMEINEN
TOPOLOGIE AUS. Karol Borsuk

LATTICE-THEORETIC APPROACH TO PROJECTIVE AND AFFINE GEOMETRY.
Bjarni Jónsson

CONVENTIONALISM IN GEOMETRY. Adolf Grünbaum



PART II. FOUNDATIONS OF PHYSICS

HOW MUCH RIGOR IS POSSIBLE IN PHYSICS? Percy W. Bridgman

LA FINITUDE EN MÉCANIQUE CLASSIQUE, SES AXIOMES ET LEURS IMPLICATIONS.
Alexandre Froda

THE FOUNDATIONS OF RIGID BODY MECHANICS AND THE DERIVATION OF ITS
LAWS FROM THOSE OF PARTICLE MECHANICS. Ernest W. Adams

THE FOUNDATIONS OF CLASSICAL MECHANICS IN THE LIGHT OF RECENT
ADVANCES IN CONTINUUM MECHANICS. Walter Noll

ZUR AXIOMATISIERUNG DER MECHANIK. Hans Hermes

AXIOMS FOR RELATIVISTIC KINEMATICS WITH OR WITHOUT PARITY.
Patrick Suppes

AXIOMS FOR COSMOLOGY. A. G. Walker

AXIOMATIC METHOD AND THEORY OF RELATIVITY. EQUIVALENT OBSERVERS AND
SPECIAL PRINCIPLE OF RELATIVITY. Yoshio Ueno

ON THE FOUNDATIONS OF QUANTUM MECHANICS. Herman Rubin

THE MATHEMATICAL MEANING OF OPERATIONALISM IN QUANTUM MECHANICS.
I. E. Segal

QUANTUM THEORY FROM NON-QUANTAL POSTULATES. Alfred Landé

QUANTENLOGIK UND DAS KOMMUTATIVE GESETZ. Pascual Jordan

LOGICAL STRUCTURE OF PHYSICAL THEORIES. Paulette Février

PHYSICO-LOGICAL PROBLEMS. J. L. Destouches


PART III. GENERAL PROBLEMS AND APPLICATIONS
OF THE AXIOMATIC METHOD

STUDIES IN THE FOUNDATIONS OF GENETICS. J. H. Woodger

AXIOMATIZING A SCIENTIFIC SYSTEM BY AXIOMS IN THE FORM OF
IDENTIFICATIONS. R. B. Braithwaite

DEFINABLE TERMS AND PRIMITIVES IN AXIOM SYSTEMS. Herbert A. Simon

AN AXIOMATIC THEORY OF FUNCTIONS AND FLUENTS. Karl Menger

AXIOMATICS AND THE DEVELOPMENT OF CREATIVE TALENT. R. L. Wilder


PREFACE 

The  thirty-three  papers  in  this  volume  constitute  the  proceedings  of 
an  international  symposium  on  The  axiomatic  method,  with  special  reference  to 
geometry  and  physics.  This  symposium  was  held  on  the  Berkeley  campus 
of  the  University  of  California  during  the  period  from  December  26,  1957 
to  January  4,  1958. 

The  volume  naturally  divides  into  three  parts.  Part  I  consists  of  fourteen 
papers  on  the  foundations  of  geometry,  Part  II  of  fourteen  papers 
on  the  foundations  of  physics,  and  Part  III  of  five  papers  on  general 
problems and applications of the axiomatic method. General differences
between the character of the papers in Part I and those in
Part  II  reflect  the  relative  state  of  development  of  the  axiomatic  method 
in  geometry  and  in  physics.  Indeed,  one  of  the  important  aims  of  the 
symposium  was  precisely  to  confront  two  disciplines  in  which  the  pattern 
of  axiomatic  development  has  been  so  markedly  different. 

Geometry,  as  is  well  known,  is  the  science  in  which,  more  than  2300 
years  ago,  the  axiomatic  method  originated.  Work  on  the  axiomatization 
of  geometry  was  greatly  stimulated,  and  our  conception  of  the  significance 
and  scope  of  the  axiomatic  method  itself  was  greatly  expanded,  through 
the  construction  of  non-Euclidean  geometries  in  the  first  half  of  the 
nineteenth  century.  By  the  turn  of  the  century  we  find  for  the  first  time, 
in  the  works  of  men  like  Pasch,  Peano,  Pieri,  and  Hilbert,  axiomatic 
treatments  of  geometry  which  both  are  complete  and  meet  the  exacting 
standards  of  contemporary  methodology  of  the  deductive  sciences.  Since 
that  time  there  has  been  a  continuous  and  accelerating  development  of  the 
subject,  so  that  at  present,  all  of  the  important  geometric  theories 
have  been  axiomatized,  and  new  theories  have  been  created  through 
changes  introduced  into  various  systems  of  axioms;  for  most  theories 
a  variety  of  axiom  systems  is  available  conforming  to  varying  ideals 
which  have  been  pursued  in  connection  with  the  axiomatization  of 
geometry.  Most  recently,  building  upon  the  work  on  axiomatization, 
it  has  become  possible  to  formalize  geometrical  theories,  and  in 
consequence  geometrical  theories  themselves  have  been  made  the 
object  of  exact  investigation  by  metamathematical  methods,  leading 
to several new kinds of results. The present volume contains new
contributions to many of the directions in which studies in the foundations
of  geometry  have  been  developing. 

Axiomatic  work  in  the  foundation  of  physics  has  had  a  more  checkered 
history.  Newton's  Principia,  first  published  in  1687,  emulated  Euclid's 
Elements,  but  the  eighteenth  and  nineteenth  centuries  did  not  witness  a 
development  of  axiomatic  methods  in  physics  at  all  comparable  to  that 
in  geometry.  Even  the  work  in  this  century  on  axiomatizing  various 
branches  of  physics  has  been  relatively  slight  in  comparison  with  the 
massive  mathematical  development  of  geometry.  There  is  not  to  our 
knowledge  a  single  treatise  on  classical  mechanics  which  compares  in 
axiomatic  precision  with  such  a  work  as  Hilbert's  well-known  text  on  the 
foundations  of  geometry;  furthermore,  the  axiomatic  treatments  of 
various  branches  of  physics  which  have  been  attempted,  including  those 
presented  in  this  volume,  do  not  yet  have  the  finished  and  complete 
character  typical  of  geometrical  axiomatizations.  Much  foundational 
work  in  physics  is  still  of  the  programmatic  sort,  and  it  is  possible  to 
maintain  that  the  status  of  axiomatic  investigations  in  physics  is  not 
yet  past  the  preliminary  stage  of  philosophical  discussion  expressing 
doubt as to its purpose and usefulness. In spite of such doubts, an
increasing effort is being made to apply axiomatic methods in physics, and
many  of  the  papers  in  Part  II  indicate  how  exact  mathematical  methods 
may  be  brought  to  bear  on  problems  in  the  foundations  of  physics.  To  the 
knowledge  of  the  Editors  the  papers  in  Part  II  constitute  the  first 
collection  whose  aim  is  specifically  to  provide  an  over-all  perspective  of 
the  application  of  axiomatic  methods  in  physics.  It  is  our  candid  hope 
that  this  book  will  be  a  stimulus  to  further  work  in  this  important  domain. 

An  attempt  has  been  made  to  give  coherence  to  the  volume  by  grouping 
papers  according  to  their  subject.  Part  I  begins  with  a  paper  by  Bernays 
which surveys the main tendencies manifesting themselves in the
construction of geometrical axiom systems. In the five papers which follow,
metamathematical  notions  referring  to  the  axiomatic  foundations  of 
various  systems  of  Euclidean  and  non-Euclidean  geometry  are  discussed, 
and  to  a  large  extent  specific  metamathematical  methods  are  applied. 
The  first  three  papers,  namely  those  by  Tarski,  Szmielew,  and  Scott,  are 
concerned  with  problems  of  completeness  and  decidability,  while  the 
remaining  two,  those  by  Robinson  and  Royden,  deal  with  problems  of 
definability.  The  next  five  papers  set  forth  new  axiomatizations  of  various 
branches of geometry. Szász is concerned with hyperbolic geometry,
Bachmann with absolute geometry, Blumenthal with elliptic geometry,
Busemann with metric differential geometry, and Heyting with affine
geometry;  the  last  of  these  authors  approaches  the  subject  from  the 
intuitionistic  point  of  view.  In  the  following  two  papers  connections 
between  the  foundations  of  geometry  and  some  related  branches  of 
mathematics  are  studied.  In  particular,  Borsuk  examines  Euclidean 
geometry from the standpoint of topology, and Jónsson surveys
projective and affine geometry from the standpoint of lattice theory. The
last paper of Part I, that of Grünbaum, deals with the philosophical
problem  of  conventionalism  in  geometry. 

In  the  case  of  Part  II,  the  order  has,  roughly  speaking,  followed  the 
historical  development  of  physics.  The  opening  paper  of  Bridgman 
analyzes  the  general  notion  of  rigor  in  physics.  It  is  followed  by  four 
papers  on  the  axiomatic  foundations  of  classical  mechanics.  Froda 
considers  particle  mechanics,  Adams  rigid  body  mechanics  and  Noll 
continuum  mechanics;  Hermes  analyzes  certain  axiomatic  problems 
surrounding  the  notion  of  mass.  Three  papers  on  relativity  follow.  Suppes 
deals  with  relativistic  kinematics,  Walker  with  relativistic  cosmology 
and  Ueno  with  relativity  theory  as  based  on  the  concept  of  equivalent 
observers.  Next  come  three  papers  on  quantum  mechanics.  Rubin 
considers  quantum  mechanics  from  the  standpoint  of  the  theory  of 
stochastic  processes;  Segal  examines  the  mathematical  meaning  of 
operationalism in quantum mechanics; and Landé approaches the subject
on the basis of non-quantal postulates. Finally, there are three papers
which deal with relations between logic and physics. Jordan considers
quantum logic and the commutative law; Février the logical structure
of physical theories; and Destouches the theory of prediction with special
reference  to  physico-logical  problems. 

The  arrangement  of  papers  in  Part  III  is  somewhat  arbitrary.  Loosely 
speaking,  the  papers  move  from  more  specific  to  more  general  topics. 
Woodger is concerned with the foundations of genetics; Braithwaite with
scientific theories whose axioms take the form of identities; Simon with
primitive and definable terms in axiom systems; Menger with the general
theory of functions in the context of the empirical sciences; and Wilder
with  the  potentiality  of  the  axiomatic  approach  as  a  method  of  teaching. 

It  goes  without  saying  that  each  author  is  solely  responsible  for  the 
content  of  his  paper.  The  Editors  have  confined  themselves  to  arranging 
the  volume  and  handling  various  technical  matters  relating  to  publication. 
In  particular,  the  choice  of  notation  and  symbolism  has  been  left  to  the 
individual  author. 


The  calendar  of  the  scientific  sessions  was  as  follows: 

December  26,  afternoon.  Opening  remarks  by  Acting  Chancellor  James 
D.  Hart  of  the  University  of  California,  Berkeley,  and  by  Professor 
Alfred  Tarski  of  the  same  University.  Section  I,  Professor  Paul  Bernays 
(Zurich,  Switzerland).  Section  II,  Professor  P.  W.  Bridgman  (Cambridge, 
Massachusetts,  U.S.A.). 

December 27, morning. Section II, Professor Hans Hermes (Münster,
Germany),  Professor  Walter  Noll  (Pittsburgh,  Pennsylvania,  U.S.A.). 

December  27,  afternoon.  Section  I,  Professor  Friedrich  Bachmann 
(Kiel,  Germany),  Professor  Alfred  Tarski  (Berkeley,  California,  U.S.A.). 

December  28,  morning.  Section  II,  Professor  Ernest  Adams  (Berkeley, 
California,  U.S.A.),  Professor  Yoshio  Ueno  (Hiroshima,  Japan). 

December  28,  afternoon.  Section  I,  Mr.  Dana  Scott  (Princeton,  New 
Jersey,  U.S.A.),  Professor  H.  L.  Royden  (Stanford,  California,  U.S.A.), 
Professor  Raphael  M.  Robinson  (Berkeley,  California,  U.S.A.). 

December  30,  morning.  Section  II,  Professor  Arthur  G.  Walker 
(Liverpool,  England),  Professor  Patrick  Suppes  (Stanford,  California, 
U.S.A.). 

December  30,  afternoon.  Commemorative  talks  on  the  first  anniversary 
of  the  death  of  Heinrich  Scholz  by  Alfred  Tarski  and  Paul  Bernays. 
Section I, Professor Paul Szász (Budapest, Hungary), Professor Wanda
Szmielew (Warsaw, Poland, and Berkeley, California, U.S.A.).

December  31,  morning.  Section  II,  Professor  Irving  E.  Segal  (Chicago, 
Illinois,  U.S.A.),  Professor  Jean-Louis  Destouches  (Paris,  France). 

December  31,  afternoon.  Section  I,  Professor  Leonard  M.  Blumenthal 
(Columbia,  Missouri,  U.S.A.),  Professor  Herbert  Busemann  (Los  Angeles, 
California,  U.S.A.). 

January  2,  morning.  Section  III,  Professor  Joseph  H.  Woodger 
(London,  England),  Professor  Richard  Braithwaite  (Cambridge,  England). 

January  2,  afternoon.  Section  I,  Professor  Karol  Borsuk  (Warsaw, 
Poland),  Professor  Bjarni  Jonsson  (Minneapolis,  Minnesota,  U.S.A.). 

January  3,  morning.  Section  II,  Professor  Pascual  Jordan  (Hamburg, 
Germany), Dr. Paulette Février (Paris, France).

January 3, afternoon. Section I, Professor Arend Heyting (Amsterdam,
Netherlands), Professor Adolf Grünbaum (Bethlehem, Pennsylvania,
U.S.A.). 

January 4, morning. Section II, Professor Alfred Landé (Columbus,
Ohio,  U.S.A.),  Professor  Herman  Rubin  (Eugene,  Oregon,  U.S.A.). 

January 4, afternoon. Section III, Professor Karl Menger (Chicago,
Illinois, U.S.A.), Professor Raymond L. Wilder (Ann Arbor, Michigan,
U.S.A.).

Three  invited  speakers  whose  papers  are  included  in  this  volume  were 
unable actually to attend the symposium: the paper of Paul Szász was
read by Steven Orey, and the papers of Alexandre Froda and Herbert
Simon were presented by title. Several talks were presented originally
under titles different from those appearing in this volume: R. B.
Braithwaite, Necessity and contingency in the empirical interpretation of
axiomatic systems; A. Landé, Non-quantal foundations of quantum
mechanics; H. L. Royden, Binary relations as primitive notions in geometry
with set-theoretical basis; A. G. Walker, Axioms of kinematical relativity.

This  symposium  was  jointly  sponsored  by  the  U.  S.  National  Science 
Foundation  (which  contributed  the  bulk  of  the  supporting  funds),  the 
International  Union  for  the  History  and  Philosophy  of  Science  (Division 
of  Logic,  Methodology,  and  Philosophy  of  Science),  and  the  University 
of  California.  The  symposium  was  organized  by  a  committee  consisting 
of  Leon  Henkin,  Secretary  (University  of  California,  Berkeley),  Victor  F. 
Lenzen  (University  of  California,  Berkeley),  Benson  Mates  (University  of 
California,  Berkeley),  Ernest  Nagel  (Columbia  University,  New  York), 
Steven  Orey  (University  of  California,  Berkeley,  and  University  of 
Minnesota,  Minneapolis),  Julia  Robinson  (Berkeley,  California),  Patrick 
Suppes  (Stanford  University,  Stanford,  California),  Alfred  Tarski, 
Chairman  (University  of  California,  Berkeley),  and  Raymond  L.  Wilder 
(University  of  Michigan,  Ann  Arbor).  The  Secretary  of  the  symposium 
was  Dorothy  Wolfe. 

We  gratefully  acknowledge  the  help  of  Mr.  Rudolf  Grewe  and  Dr. 
Dana  Scott  in  preparing  this  volume  for  publication. 

University  of  California,  Berkeley  THE  EDITORS 

Stanford  University 
February  1959 


Symposium  on  the  Axiomatic  Method 


DIE MANNIGFALTIGKEIT DER DIREKTIVEN FÜR DIE
GESTALTUNG GEOMETRISCHER AXIOMENSYSTEME
[THE VARIETY OF DIRECTIVES FOR THE DESIGN OF
GEOMETRIC AXIOM SYSTEMS]

PAUL BERNAYS

Eidgenössische Technische Hochschule, Zürich, Switzerland

When we consider the axiomatizations of geometry, we are struck by the
great variety of viewpoints from which axiomatization can be, and already
has been, carried out. The original, simple old conception, according to
which one can speak simply of the axioms of geometry, has been displaced
not only by the discovery of the non-Euclidean geometries, and further by
the insight that one and the same geometry admits different
axiomatizations; rather, essentially different methodological viewpoints
have emerged from which the axiomatization of geometry has been
undertaken, and whose aims are in certain respects even antagonistic.

The germ of this variety is already to be found in Euclid's axiomatics.
Its form was determined by the circumstance that here, by way of geometry,
one was led for the first time to the problem of axiomatics. Geometry is
here, so to speak, mathematics as such. The relation to number theory is,
methodologically, probably not entirely clear. In certain parts a portion
of number theory is developed using the intuitive conception of number.
Furthermore, in the theory of proportions the concept of number is used
contentually, even with an implicit inclusion of the tertium non datur;
it seems, however, that one sought to avoid its full use.

While the special methodological position of the concept of number does
not emerge explicitly here, the concept of magnitude is expressly placed
at the head as a contentual tool, and moreover in a manner that we can no
longer concede today: namely, it is assumed as self-evident of various
kinds of objects that they have the character of magnitudes. The concept
of magnitude is, to be sure, also subjected to axiomatization; but the
relevant axioms are expressly set apart from the remaining axioms as
preliminary ones (κοιναὶ ἔννοιαι, common notions). These axioms are
similar in kind to those laid down today for abelian groups. What was
omitted, however, owing to the methodological standpoint of the time, was
to fix axiomatically which objects are to be regarded as magnitudes.

It is all the more admirable that even then attention was drawn to the
special character of the assumption by which the Archimedean magnitudes,
as we call them today, are distinguished. The Archimedean (Eudoxian)
axiom is then used in an essential way in the medieval tradition that
followed the Greeks, in particular in the investigations of the Arabs on
the parallel axiom. It also appears as essential in Saccheri's proof
excluding the "hypothesis of the obtuse angle". Indeed, this exclusion is
not possible without the Archimedean axiom, since a non-Archimedean,
weakly spherical (or weakly elliptic) geometry is consistent with the
axioms of Euclidean geometry apart from the parallel axiom.
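
In present-day notation, which is not Bernays's own, the Archimedean
(Eudoxian) axiom can be sketched as follows. The symbols a and b for two
magnitudes of the same kind (for example, segments) and n · a for the
n-fold sum of a are assumptions made here purely for illustration.

```latex
% Archimedean (Eudoxian) axiom for positive magnitudes a, b of one kind:
% some finite multiple of a exceeds b.
\forall a > 0 \;\; \forall b > 0 \;\; \exists n \in \mathbb{N} : \quad n \cdot a > b
```

A system of magnitudes in which this fails contains an element that is
infinitely small relative to another; this is the non-Archimedean
situation referred to in the text.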

In all these investigations the second continuity axiom, which was
formulated in the later nineteenth century, does not yet appear. In the
arguments where it would have been relevant, such as the determinations of
area and length, it could be dispensed with on the basis of the
aforementioned use of the concept of magnitude, according to which it was
taken as self-evident, for example, that the area of a circle and its
circumference each have a definite magnitude. At the beginning of the
modern era the place of the old theory of magnitudes was taken, as the
dominant superordinate discipline, by the theory of magnitudes of
analysis, which developed very richly in form and content even before it
attained methodological clarity.

Admittedly, analysis at first played no significant part in the discovery
of non-Euclidean geometry; but it becomes dominant in the subsequent
investigations of Riemann and Helmholtz, and later of Lie, which
characterize the three distinguished geometries by certain very general
conditions that can be formulated analytically. Characteristic of this
treatment of geometry is, in particular, that one takes as object not only
the individual spatial figures but also the space manifold itself. The
possibility of carrying out such a study displayed the powerful conceptual
and formal means that mathematics had acquired in the meantime; and the
way the problem was posed expressed the conceptual-speculative direction
that mathematics took in the course of the nineteenth century.


The differential-geometric treatment of the foundations of geometry has,
incidentally, been developed further up to the most recent times by
Hermann Weyl, Élie Cartan, and Levi-Civita, in connection with Einstein's
general theory of relativity. Impressive and elegant as the achievements
in this respect are, mathematicians have nevertheless not been content
with them from the standpoint of foundational theory. First, one sought to
free oneself from the assumption, essential to the differential-geometric
method, that the mapping functions are differentiable. This required the
development of the methods of a general topology, which began around the
turn of the century and has since undergone so imposing a development.
Going further, one strove to become independent altogether of the
assumption that the geometric magnitudes are Archimedean.

This tendency belongs to the development through which analysis has to a
certain degree lost its previously dominant position. This new stage in
mathematical research was tied to the effects of the already mentioned
conceptual-speculative direction of nineteenth-century mathematics, as
manifested in particular in the creation of general set theory, in the
more rigorous foundation of analysis, in the constitution of mathematical
logic, and in the new conception of axiomatics.

It was at the same time characteristic of this new stage that one returned
more to the methods of ancient Greek axiomatics, as has happened
repeatedly in those epochs in which greater emphasis was placed on
conceptual precision. In Hilbert's Grundlagen der Geometrie we find, on
the one hand, this return to the old elementary axiomatics, though with a
fundamentally changed methodological conception, and, on the other hand,
as a principal theme, the elimination of the Archimedean axiom as far as
possible: in the theory of proportions, in the concept of area, and in the
foundation of the segment calculus. For Hilbert, incidentally, this kind
of axiomatization did not have an exclusive character; soon afterwards he
set beside it another kind of foundation, with which the aforementioned
program of a topological foundation was set up and carried out for the
first time.

At about the same time as Hilbert's foundation, the axiomatization of
geometry was also cultivated in the school of Peano and Pieri. Soon the
axiomatic investigations of Veblen and R. L. Moore followed; and with this
the directions of research had been taken along which work on the
foundations of geometry still proceeds today. What characterizes this work
is a plurality of methodological directions.

The first is the one that seeks to characterize the manifold of congruent
transformations by conditions that are as general and concise as possible;
the second is the one that puts the projective structure of space first
and strives to reduce the metric to the projective by the method of
projective metric developed by Cayley and Klein; and the third is the one
that aims at an elementary axiomatization of the full geometry of
congruence.

Several essentially new viewpoints have been added in the development of
these directions. For one thing, projective axiomatics received a
strengthened systematization by means of lattice theory. Furthermore, it
was realized that in characterizing the group of congruent transformations
the set-theoretic and function-theoretic concept formations can be made to
recede, by fixing the transformations through the figures that determine
them. The procedure thereby comes close to that of elementary axiomatics,
since the group relations now present themselves as relations between
geometric figures.

I do not wish, however, to speak in more detail here about these two
directions of research in geometric axiomatics, for which more authentic
representatives are present here, nor about the successes achieved by the
use of topological methods, of which recent papers by Freudenthal in
particular give a survey; I shall turn instead to the questions of the
direction of axiomatization named in third place.

Even within this direction we again find a variety of possible aims. On
the one hand, one can aim at making do with as few basic elements as
possible, for instance with only one primitive predicate and one sort of
individuals. On the other hand, one can be chiefly intent on letting
natural separations of parts of the axiomatics stand out. These viewpoints
lead to various alternatives.

Thus, on the one hand, the study of non-Euclidean geometry suggests
putting "absolute" geometry first. On the other hand, there is also much
to be said for a construction in which affine vector geometry is put
first, as is done at the beginning of Weyl's "Raum, Zeit, Materie". One
can hardly satisfy both of these viewpoints at once in a single
axiomatics. Another example is this. When the axioms of incidence and
order are put first, it is a possible and elegant conceptual reduction to
reduce, following Veblen, the concept of collinearity to the concept of
betweenness. On the other hand, for some considerations it is important to
separate out those consequences of the incidence axioms that are
independent of the concept of order; thus it is desirable to recognize
that the foundation of the segment calculus from the incidence axioms is
independent of the order axioms. Again, in the theory of order itself it
has been found possible to save on axioms of linear order by applying
Pasch's axiom; on the other hand, in a certain respect an arrangement of
the axioms is to be preferred in which the axioms characteristic of linear
order are set apart.
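
The reduction of collinearity to betweenness attributed above to Veblen
can be sketched in present-day notation as follows. The predicate names
Col and B, and the restriction to three distinct points with strict
betweenness, are assumptions made here for illustration and are not taken
from the text.

```latex
% B(x,y,z) reads: y lies strictly between x and z.
% For pairwise distinct points a, b, c, collinearity is defined by:
\mathrm{Col}(a,b,c) \;:\Longleftrightarrow\;
  B(a,b,c) \,\lor\, B(b,c,a) \,\lor\, B(c,a,b)
```

With such a definition the single ternary relation B suffices as a
primitive, at the price that consequences of incidence alone can no longer
be read off separately from the order axioms, which is the tension the
text describes.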

These examples of alternatives by no means exhaust the variety of possible
aims, or of those actually pursued. Thus it is a possible and reasonable,
though not obligatory, regulative viewpoint that the axioms should be
formulated so that each refers only to a bounded portion of space. This
idea is probably already implicitly at work in Euclid's axiomatics; and it
may well be that the offence taken so early at the parallel axiom rests
precisely on the fact that the concept of sufficiently far prolongation
occurs in Euclid's formulation. The first explicit execution of this point
of the program was due to Moritz Pasch, and connected with it was the
introduction of ideal elements by means of intersection theorems, a method
of founding projective geometry that has since been successfully
elaborated.

Another kind of possible additional task is to imitate conceptually the
imprecision of our pictorial intuition, as Hjelmslev has done. This
admittedly yields not merely a different kind of axiomatization but an
altogether different system of relations, a procedure which, probably
because of its complexity, has not found much favour. Yet even without
departing so far from the customary in this direction, one can aim at
something similar in a certain respect by avoiding the concept of point as
a basic sort, as is done in several interesting recent axiomatizations, in
particular in that of Huntington.

In this way it appears in the most varied ways that there is no unique
optimum for the design of a geometric axiom system. As for reductions with
regard to the primitive notions and the sorts of objects, it should always
be remembered that, notwithstanding the fundamental interest of every such
possibility of reduction, the actual use of such a reduction is advisable
only if it yields a perspicuous axiom system.

Nevertheless, certain directives for reductions can be named that we can
accept in general. Take as an example Hilbert's version of the axiomatics.
Here, on the one hand, the straight lines are taken as a sort of objects;
on the other hand, the rays are introduced as point sets, and the angles
are then defined as ordered pairs of two rays issuing from one point, that
is, as pairs of sets. Here there are indeed possibilities for a simplifying
reduction. Opinions may differ as to whether, instead of the different
sorts "point, line, plane", one wishes to take only the sort of points as
basic, in which case the incidence relation is replaced by the relations of
collinearity and coplanarity of points. In the lattice-theoretic treatment,
lines and planes are taken as objects on an equal footing with points. Here
one again faces an alternative. By contrast, introducing the rays as point
sets certainly exceeds the framework of elementary geometry and is not
needed for it either. In general we may take it as a directive that higher
sorts should not be introduced without need. In the case of the definition
of angle, this can be avoided by reducing statements about angles to
statements about triples of points, as was carried out by R. L. Moore. Here
even a further reduction is achieved, in that angle congruence is defined
altogether by means of segment congruence; yet this again involves a
certain loss. For the proofs then rely essentially on the congruence of
oppositely oriented triangles. Hence this kind of axiomatization is not
suited to the circle of problems of those investigations of Hilbert which
concern the relation of orientation-preserving congruence to symmetry. This
remark admittedly also applies to most of the axiomatizations in which the
concept of reflection stands at the head.

Alongside the general points of view, I would like to mention, as a
particular matter, a special possibility for setting up an elementary axiom
system, namely an axiomatics in which the notion "the point triple a, b, c
forms a right angle at b" is taken as the only primitive relation and the
points as the only primitive sort, a programme to which attention has
recently been drawn by a paper of Dana Scott. This relation satisfies the
necessary condition established by Tarski for a predicate that suffices by
itself as the primitive of plane geometry. Compared with the procedure of
Pieri, which has become the model for an axiomatics of this kind and in
which the relation "b and c are equidistant from a" was taken as primitive
notion, there seems to be a simplification here insofar as the concept of
collinearity of points is more closely connected with that of the right
angle than with Pieri's primitive notion. As regards the concept of
congruence, however, the reduction considered seems to yield no
simplification for the axioms of congruence. Incidentally, this
axiomatization, like that of Pieri, is one of those that do not single out
orientation-preserving congruence. ¹

1) Some details on the definitions of the notions of incidence, order and
congruence from the notion of right angle, and on part of the axiom system,
follow in an appendix.

For an elementary axiomatization of geometry a special question arises,
that of obtaining completeness in the sense of categoricity. In most axiom
systems this is achieved by the continuity axioms. The introduction of
these axioms, however, as is well known, goes beyond the framework of
ordinary predicate logic, since the Archimedean axiom uses the general
concept of number and the second continuity axiom the general concept of
predicate or set. We have since learned from Tarski's investigations that
completeness, at least in the deductive sense, can be attained within an
elementary framework, the remarkable thing being that the cut axiom is
retained in a certain formalization, while the Archimedean axiom is
dropped. The Archimedean axiom falls formally outside the usual framework
insofar as, in logical formalization, it has the form of an infinite
disjunction, whereas the cut axiom, by virtue of its form of generality,
can be represented by an axiom schema and thereby adapted in its
application to the formal framework at hand; for the elementary framework
of predicate logic, the provability of the Archimedean axiom from the cut
axiom is then lost. Such a restriction to a predicate-logical framework
admittedly has the consequence that various considerations can be carried
out only metatheoretically, for example the proof of the theorem that a
simple closed polygon divides the plane, and likewise the study of
equicomplementability and equidecomposability of polygons. Here one once
again faces an alternative, namely whether to give priority to the
elementary character of the logical framework, or not to restrict oneself
with regard to the logical framework, in which case various gradations
still come into consideration.

With regard to the use of a second-order logic, let it only be recalled
here that such a logic can be made precise within the framework of
axiomatic set theory in such a way that no appreciable restriction of the
methods of proof results. Skolem's paradox, too, causes no real
embarrassment in the case of geometry, insofar as it can be eliminated by
identifying, in the model-theoretic considerations, the concept of set that
occurs in one of the higher axioms with the concept of set of the model
theory.

In conclusion I would like to emphasize that the circumstance stressed in
my remarks, that there is no unique optimum in the design of an
axiomatics, by no means implies that the products of geometric axiomatics
necessarily bear the character of the imperfect and fragmentary. You know
that in this field a number of formulations of great perfection and finish
have been achieved. It is precisely the multiplicity of possible aims that
has the effect that the earlier is in general not simply superseded by the
newer, while on the other hand every perfection attained still leaves room
for further tasks.

APPENDIX. Remarks on the task of axiomatizing Euclidean plane geometry
with the single primitive relation $R(a, b, c)$: "the point triple a, b, c
forms a right angle at b". The axiomatization succeeds in a simple way as
long as only the relations of collinearity and parallelism are considered.
For the theory of collinearity the following axioms suffice:

A1  $\neg R(a, b, a)$

A2  $R(a, b, c) \to R(c, b, a) \,\&\, \neg R(a, c, b)$ ²

A3  $R(a, b, c) \,\&\, R(a, b, d) \,\&\, R(e, b, c) \to R(e, b, d)$

A4  $R(a, b, c) \,\&\, R(a, b, d) \,\&\, c \neq d \,\&\, R(e, c, b) \to R(e, c, d)$

A5  $a \neq b \to (Ex)\,R(a, b, x)$.
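
As a hedged illustration that is not part of the original text, the
universal axioms A1-A4 can be sanity-checked by brute force in the usual
coordinate model. The sketch assumes that $R(a, b, c)$ is read as "a and c
both differ from b, and the vectors a - b and c - b have dot product
zero"; the names `R`, `GRID`, `check_A1_to_A4` and `witness_A5` are
invented here. A finite check of this kind is only a plausibility test,
not a proof.

```python
from itertools import product


def R(a, b, c):
    """Assumed model of R: right angle at b, with a != b and c != b."""
    ux, uy = a[0] - b[0], a[1] - b[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    return (ux, uy) != (0, 0) and (vx, vy) != (0, 0) and ux * vx + uy * vy == 0


# Nine lattice points; A1-A4 are universal, so any subset of the plane will do.
GRID = [(x, y) for x in range(3) for y in range(3)]


def check_A1_to_A4(points=GRID):
    """Return True if A1-A4 hold for every 5-tuple drawn from `points`."""
    for a, b, c, d, e in product(points, repeat=5):
        if R(a, b, a):                                            # A1
            return False
        if R(a, b, c) and not (R(c, b, a) and not R(a, c, b)):    # A2
            return False
        if R(a, b, c) and R(a, b, d) and R(e, b, c) and not R(e, b, d):  # A3
            return False
        if (R(a, b, c) and R(a, b, d) and c != d and R(e, c, b)
                and not R(e, c, d)):                              # A4
            return False
    return True


def witness_A5(a, b):
    """For a != b, return a point x with R(a, b, x): turn a - b by 90 degrees."""
    return (b[0] - (a[1] - b[1]), b[1] + (a[0] - b[0]))
```

A5 is existential, so the sketch supplies an explicit witness rather than
searching the grid, where a suitable point might be missing.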

To these is added the definition of the relation $\mathrm{Koll}(a, b, c)$:
"the points a, b, c are collinear":

DEFINITION 1.  $\mathrm{Koll}(a, b, c) \leftrightarrow (x)(R(x, a, b) \to R(x, a, c)) \lor a = c$.

The following theorems are then provable:

(1)  $\mathrm{Koll}(a, b, c) \leftrightarrow a = b \lor a = c \lor b = c \lor (Ex)(R(x, a, b) \,\&\, R(x, a, c))$

(2)  $\mathrm{Koll}(a, b, c) \to \mathrm{Koll}(a, c, b) \,\&\, \mathrm{Koll}(b, a, c)$

(3)  $\mathrm{Koll}(a, b, c) \,\&\, \mathrm{Koll}(a, b, d) \,\&\, a \neq b \to \mathrm{Koll}(b, c, d)$

(4)  $R(a, b, c) \,\&\, \mathrm{Koll}(b, c, d) \,\&\, b \neq d \to R(a, b, d)$

(5)  $R(a, b, c) \to \neg \mathrm{Koll}(a, b, c)$

(6)  $R(a, b, c) \,\&\, R(a, b, d) \to \mathrm{Koll}(b, c, d)$

(7)  $R(a, b, c) \,\&\, R(a, b, d) \to \neg R(a, c, d)$.

For the proof:  $\mathrm{Koll}(c, d, b) \,\&\, c \neq b \to (R(a, c, d) \to R(a, c, b))$

(8)  $R(a, b, c) \,\&\, R(a, b, d) \,\&\, R(a, e, c) \,\&\, R(a, e, d) \to c = d \lor b = e$.

For the proof:  $\mathrm{Koll}(b, c, d) \,\&\, \mathrm{Koll}(e, c, d) \,\&\, c \neq d \to \mathrm{Koll}(b, c, e)$

$\ldots \,\&\, b \neq e \,\&\, R(a, b, c) \to R(a, b, e)$
$\ldots \,\&\, b \neq e \,\&\, R(a, e, c) \to R(a, e, b)$
$R(a, b, e) \to \neg R(a, e, b)$.
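
The characterization of collinearity in theorem (1) can be illustrated in
coordinates. The following sketch is an assumption-laden check, not
Bernays' argument: it again reads $R$ as a dot-product condition, and the
names `koll` and `cross_collinear` are hypothetical. In the plane every x
with $R(x, a, b)$ lies on the perpendicular to ab through a, so testing one
such witness is enough to decide the existential clause.

```python
def R(a, b, c):
    """Assumed model of R: right angle at b, with a != b and c != b."""
    ux, uy = a[0] - b[0], a[1] - b[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    return (ux, uy) != (0, 0) and (vx, vy) != (0, 0) and ux * vx + uy * vy == 0


def koll(a, b, c):
    """Collinearity in the form of theorem (1), decided with one witness x."""
    if a == b or a == c or b == c:
        return True
    # x = a + (b - a) rotated by 90 degrees, so R(x, a, b) holds by construction.
    x = (a[0] - (b[1] - a[1]), a[1] + (b[0] - a[0]))
    return R(x, a, b) and R(x, a, c)


def cross_collinear(a, b, c):
    """Reference test: the cross product of b - a and c - a vanishes."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]) == 0
```

On integer points the two predicates agree, which is what one would expect
if the dot-product reading of $R$ is the intended model.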


2) This axiom already excludes elliptic geometry.

For the theory of parallelism we add two further axioms:

A6  $a \neq b \,\&\, a \neq c \to (Ex)(R(x, a, b) \,\&\, R(x, a, c)) \lor (Ex)(R(a, x, b) \,\&\, R(a, x, c)) \lor R(a, b, c) \lor R(a, c, b)$

In customary terms, the axiom says that from a point a outside a line bc a
perpendicular can be dropped onto that line. That the perpendicular is
uniquely determined by the point a and the line bc follows with the help
of (4) and (8).

A7  $R(a, b, c) \,\&\, R(b, c, d) \,\&\, R(c, d, a) \to R(d, a, b)$

This is a form of the Euclidean parallel axiom in the narrower,
angle-metric sense.
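
Read in coordinates, A7 says that a quadrilateral with three right angles
also has a fourth. The sketch below tests this on a small lattice under
the assumption that $R$ is the dot-product relation; the helper names
`check_A7` and `count_A7_instances` are made up for the illustration, and
the second one only confirms that the premise is not vacuous on the grid.

```python
from itertools import product


def R(a, b, c):
    """Assumed model of R: right angle at b, with a != b and c != b."""
    ux, uy = a[0] - b[0], a[1] - b[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    return (ux, uy) != (0, 0) and (vx, vy) != (0, 0) and ux * vx + uy * vy == 0


GRID = [(x, y) for x in range(3) for y in range(3)]


def premise(a, b, c, d):
    """Right angles at b, c and d of the quadrilateral a, b, c, d."""
    return R(a, b, c) and R(b, c, d) and R(c, d, a)


def check_A7(points=GRID):
    """True if every quadruple with three right angles has the fourth at a."""
    return all(R(d, a, b)
               for a, b, c, d in product(points, repeat=4)
               if premise(a, b, c, d))


def count_A7_instances(points=GRID):
    """Number of quadruples on the grid that satisfy the premise of A7."""
    return sum(1 for q in product(points, repeat=4) if premise(*q))
```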

Parallelism is now defined by:

DEFINITION 2.  $\mathrm{Par}(a, b; c, d) \leftrightarrow a \neq b \,\&\, c \neq d \,\&\, (Ex)(Ey)(R(a, x, y) \,\&\, R(b, x, y) \,\&\, R(c, y, x) \,\&\, R(d, y, x))$

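The following theorems turn out to be provable:

(9)  $\mathrm{Par}(a, b; c, d) \to \mathrm{Par}(b, a; c, d) \,\&\, \mathrm{Par}(c, d; a, b)$

(10)  $\mathrm{Par}(a, b; c, d) \to a \neq c \,\&\, a \neq d \,\&\, b \neq c \,\&\, b \neq d$

(11)  $\mathrm{Par}(a, b; c, d) \leftrightarrow a \neq b \,\&\, c \neq d \,\&\, (Ex)(Eu)((R(a, x, u) \lor x = a) \,\&\, (R(b, x, u) \lor x = b) \,\&\, (R(x, u, c) \lor u = c) \,\&\, (R(x, u, d) \lor u = d))$

For the proof of the implication from right to left one has to show that
at least five distinct points lie on a line a, b, which can be done with
the help of axioms A1-A6.

(12)  $\mathrm{Par}(a, b; c, d) \to (x)((R(a, x, c) \lor x = a) \,\&\, (R(b, x, c) \lor \ldots$

(13)  $\mathrm{Par}(a, b; c, d) \,\&\, \mathrm{Koll}(a, b, e) \,\&\, b \neq e \to \mathrm{Par}(b, e; c, d)$

and from this in particular

(14)  $\mathrm{Par}(a, b; c, d) \to \neg \mathrm{Koll}(a, b, c)$;

furthermore

(15)  $\mathrm{Par}(a, b; c, d) \,\&\, \mathrm{Koll}(a, b, e) \to \neg \mathrm{Koll}(c, d, e)$

(16)  $\neg \mathrm{Koll}(a, b, c) \to (Ex)\,\mathrm{Par}(a, b; c, x)$

(17)  $\mathrm{Par}(a, b; c, d) \,\&\, \mathrm{Par}(a, b; c, e) \to \mathrm{Koll}(c, d, e)$

(18)  $\mathrm{Par}(a, b; c, d) \,\&\, \mathrm{Par}(a, b; e, f) \to \mathrm{Par}(c, d; e, f) \lor (\mathrm{Koll}(e, c, d) \,\&\, \ldots$
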
Connected with the concept of parallelism is that of vector equality:
"a, b and c, d are the opposite sides of a parallelogram":

DEFINITION 3.  $\mathrm{Pag}(a, b; c, d) \leftrightarrow \mathrm{Par}(a, b; c, d) \,\&\, \mathrm{Par}(a, c; b, d)$

With this one can prove:

(19)  $\mathrm{Pag}(a, b; c, d) \to \mathrm{Pag}(c, d; a, b) \,\&\, \mathrm{Pag}(a, c; b, d)$

(20)  $\mathrm{Pag}(a, b; c, d) \,\&\, \mathrm{Pag}(a, b; c, e) \to d = e$

(21)  $\mathrm{Pag}(a, b; c, d) \to \neg \mathrm{Koll}(a, b, c)$.

For the proof of the existence theorem

(22)  $\neg \mathrm{Koll}(a, b, c) \to (Ex)\,\mathrm{Pag}(a, b; c, x)$

a further axiom is needed:

A8  $R(a, b, c) \to (Ex)(R(a, c, x) \,\&\, R(c, b, x))$.

With the help of this axiom it can be proved in general that two distinct,
non-parallel lines have a point of intersection:

(23)  $\neg \mathrm{Koll}(a, b, c) \,\&\, \neg \mathrm{Par}(a, b; c, d) \to (Ex)(\mathrm{Koll}(a, b, x) \,\&\, \mathrm{Koll}(c, d, x))$.

Whether a perspicuous axiomatics with the primitive notion R can be
achieved on the whole may be left open. We content ourselves here with
setting up definitions for the essential further notions. For these, at
any rate, a certain perspicuity can be attained.

Connected with the figure of the parallelogram are the following two
different definitions of the relation "a is the midpoint of the segment
b, c":

DEFINITION $4_1$.  $\mathrm{Mp}_1(a; b, c) \leftrightarrow (Ex)(Ey)(\mathrm{Pag}(b, x; y, c) \,\&\, \mathrm{Koll}(a, b, c) \,\&\, \mathrm{Koll}(a, x, y))$

DEFINITION $4_2$.  $\mathrm{Mp}_2(a; b, c) \leftrightarrow (Ex)(Ey)(\mathrm{Pag}(x, y; a, b) \,\&\, \mathrm{Pag}(x, y; c, a))$.

In the sense of the second definition one can prove the possibility of
doubling a segment:

(24)  $a \neq b \to (Eu)\,\mathrm{Mp}_2(a; b, u)$.

The existence of the midpoint of a segment in the sense of Definition
$4_1$, i.e.

(25)  $b \neq c \to (Eu)\,\mathrm{Mp}_1(u; b, c)$,

can be proved if one adds the further axiom:

A9  $\mathrm{Par}(a, b; c, d) \,\&\, \mathrm{Par}(a, c; b, d) \to \neg \mathrm{Par}(a, d; b, c)$.
(In a parallelogram the diagonals intersect.)

By specializing the figure belonging to the definition of $\mathrm{Mp}_1$
we obtain a definition of the relation "a, b, c form an isosceles triangle
with apex at a":

DEFINITION $5_1$.  $\mathrm{Ist}_1(a; b, c) \leftrightarrow (Eu)(Ev)(\mathrm{Pag}(a, b; c, v) \,\&\, R(a, u, b) \,\&\, R(a, u, c) \,\&\, R(b, u, v))$.

With the help of $\mathrm{Mp}_1$ and $\mathrm{Ist}_1$ we can define Pieri's
primitive notion "a is equidistant from b and c":

DEFINITION 6.  $\mathrm{Is}_1(a; b, c) \leftrightarrow b = c \lor \mathrm{Mp}_1(a; b, c) \lor \mathrm{Ist}_1(a; b, c)$.

Another way of defining the notion Is rests on the use of symmetry. For
this the following auxiliary notion serves: "a, b, c, d, e form a
'normal' quintuple":

DEFINITION 7.  $\mathrm{Qn}(a, b, c, d, e) \leftrightarrow R(a, c, b) \,\&\, R(a, d, b) \,\&\, R(a, e, c) \,\&\, R(a, e, d) \,\&\, R(b, e, c) \,\&\, c \neq d$.

With the help of Qn we obtain a further kind of definition for Mp and Ist:

DEFINITION $4_3$.  $\mathrm{Mp}_3(a; b, c) \leftrightarrow (Ex)(Ey)\,\mathrm{Qn}(x, y, b, c, a)$

DEFINITION $5_2$.  $\mathrm{Ist}_2(a; b, c) \leftrightarrow (Ex)(Ey)\,\mathrm{Qn}(a, x, b, c, y)$,

from which $\mathrm{Is}_2$ can be defined in the same way as $\mathrm{Is}_1$.

Connected with this, furthermore, is the definition of the mirror symmetry
of points a, b with respect to a line c d:

DEFINITION 8.  $\mathrm{Sym}(a, b; c, d) \leftrightarrow c \neq d \,\&\, (Ex)(Ey)(Ez)(\mathrm{Koll}(x, c, d) \,\&\, \mathrm{Koll}(y, c, d) \,\&\, \mathrm{Qn}(x, y, a, b, z))$.

For the definition of segment congruence we finally need the notion of
orientation-preserving congruence on a line: "the segments a b and c d are
collinear, congruent and equally directed":

DEFINITION $9_1$.  $\mathrm{Lg}_1(a, b; c, d) \leftrightarrow \mathrm{Koll}(a, b, c) \,\&\, (Ex)(Ey)(\mathrm{Pag}(a, x; b, y) \,\&\, \mathrm{Pag}(c, x; d, y))$,

or also:

DEFINITION $9_2$.  $\mathrm{Lg}_2(a, b; c, d) \leftrightarrow \mathrm{Koll}(a, b, c) \,\&\, a \neq b \,\&\, (Ex)(\mathrm{Mp}(x; b, c) \,\&\, \mathrm{Mp}(x; a, d)) \lor (a = d \,\&\, \mathrm{Mp}(a; b, c)) \lor (b = c \,\&\, \mathrm{Mp}(b; a, d))$,

(where any one of the three definitions above may be taken for Mp).
Segment congruence as a whole can now be defined (with either of the two
definitions of Lg):

DEFINITION 10.  $\mathrm{Kg}(a, b; c, d) \leftrightarrow \mathrm{Lg}(a, b; c, d) \lor \mathrm{Lg}(a, b; d, c) \lor (a = b \,\&\, \mathrm{Is}_1(a; b, d)) \lor (Ex)(\mathrm{Pag}(a, b; c, x) \,\&\, \mathrm{Is}_1(c; x, d))$.

By a definition analogous to that of $\mathrm{Lg}_2$ one can also introduce
the congruence of angles with a common vertex as a six-place relation,
after first introducing the notion of the angle bisector: "d ($\neq$ a)
lies on the bisector of the angle b a c":

DEFINITION 11.  $\mathrm{Wh}(a, d; b, c) \leftrightarrow \neg \mathrm{Koll}(a, b, c) \,\&\, (Ex)(Ey)(Ez)(\mathrm{Koll}(a, c, x) \,\&\, \mathrm{Koll}(a, d, y) \,\&\, \mathrm{Qn}(a, y, b, x, z))$.

In view of the highly composite character of this congruence relation Kg,
one will, in the axiomatization, reduce the laws about Kg to laws about
the notions that occur as constituents of the defining expression. Because
of the plurality of definitions of Mp, Ist and Is, there are alternatives
here as to whether one draws more heavily on the relations of parallelism
or on those of symmetry. In any case the axiom of vector geometry

A10  $\mathrm{Pag}(a, b; p, q) \,\&\, \mathrm{Pag}(b, c; q, r) \to \mathrm{Pag}(a, c; p, r) \lor (\mathrm{Koll}(a, c, p) \,\&\, \mathrm{Koll}(a, c, r))$

or an equivalent one is likely to be useful. On the whole one could set
oneself the goal here of presenting the interplay of parallelism and
reflection found in Euclidean plane geometry in as symmetric a manner as
possible.

As regards, finally, the betweenness relation, the figure for the
definition of the relation "a lies between b and c" is already contained
as a constituent in that of Qn. Namely, we can define:

DEFINITION 12.  $\mathrm{Zw}(a; b, c) \leftrightarrow (Ex)(R(b, a, x) \,\&\, R(c, a, x) \,\&\, R(b, x, c))$.

For this notion the following are provable to begin with:

(26)  $\neg \mathrm{Zw}(a; b, b)$

(27)  $\mathrm{Zw}(a; b, c) \to \mathrm{Zw}(a; c, b)$

(28)  $\mathrm{Zw}(a; b, c) \to \mathrm{Koll}(a, b, c)$

and furthermore, using A5, A6 and A8,

(29)  $a \neq b \to (Ex)\,\mathrm{Zw}(x; a, b) \,\&\, (Ex)\,\mathrm{Zw}(b; a, x)$.
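
Definition 12 can be made concrete in the real coordinate plane; what
follows is a sketch under that assumption, with invented names `R` and
`zw`. If a = b + t(c - b) and x = a + s n, where n is c - b turned by 90
degrees, then $R(b, a, x)$ and $R(c, a, x)$ hold for s ≠ 0, and
$R(b, x, c)$ reduces to s² = t(1 - t). A real witness x therefore exists
exactly when a is on the line bc with 0 < t < 1, which is what the
function tests for integer or rational coordinates.

```python
from fractions import Fraction


def R(a, b, c):
    """Assumed model of R: right angle at b, with a != b and c != b."""
    ux, uy = a[0] - b[0], a[1] - b[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    return (ux, uy) != (0, 0) and (vx, vy) != (0, 0) and ux * vx + uy * vy == 0


def zw(a, b, c):
    """Zw(a; b, c) in the real plane: a lies strictly between b and c."""
    dx, dy = c[0] - b[0], c[1] - b[1]
    if (dx, dy) == (0, 0):
        return False              # compare theorem (26)
    ex, ey = a[0] - b[0], a[1] - b[1]
    if dx * ey - dy * ex != 0:
        return False              # a is off the line bc, compare theorem (28)
    # Parameter t of a on the line through b and c, kept exact.
    t = Fraction(ex * dx + ey * dy, dx * dx + dy * dy)
    return 0 < t < 1              # a witness exists iff t * (1 - t) > 0
```

For a = (1, 0), b = (0, 0), c = (2, 0) one has t = 1/2 and s = 1/2, so
x = (1, 1) is a witness that can be checked directly with `R`.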

For obtaining the further properties of the betweenness notion the
following axioms can serve:

A11  $R(a, b, c) \,\&\, R(a, b, d) \,\&\, R(c, a, d) \,\&\, R(e, c, b) \to \neg R(b, e, d)$

A12  $R(a, b, d) \,\&\, R(d, b, c) \,\&\, a \neq c \to \mathrm{Zw}(a; b, c) \lor \mathrm{Zw}(b; a, c) \lor \mathrm{Zw}(c; a, b)$

A13  $\mathrm{Zw}(a; b, c) \,\&\, \mathrm{Zw}(b; a, d) \to \mathrm{Zw}(a; c, d)$

A14  $R(a, b, d) \,\&\, R(d, b, c) \,\&\, R(a, c, e) \,\&\, \mathrm{Zw}(d; a, e) \to \mathrm{Zw}(b; a, c)$

From this axiom one can obtain in a few steps the more general theorem:

(30)  $\mathrm{Zw}(b; a, c) \,\&\, \mathrm{Koll}(a, d, e) \,\&\, \mathrm{Par}(b, d; c, e) \ldots$

This is done by using the theorem

(31)  $R(a, b, e) \,\&\, R(e, b, c) \,\&\, R(b, a, d) \,\&\, R(b, c, f) \,\&\, R(b, e, d) \,\&\, R(b, e, f) \,\&\, \mathrm{Zw}(b; a, c) \to \mathrm{Zw}(e; d, f)$,

which can be derived from the axiom A10 mentioned above.
With the help of (30) and the axiom A13 one can prove:

(32)  $\neg \mathrm{Koll}(a, b, c) \,\&\, \mathrm{Zw}(b; a, d) \,\&\, \mathrm{Zw}(e; b, c) \to (Ex)(\mathrm{Koll}(e, d, x) \,\&\, \mathrm{Zw}(x; a, c))$,

i.e. Pasch's axiom in Veblen's narrower version.

In addition, let us mention the following definition of Kg by means of the
notions Is and Zw, which rests on a construction of Euclid:

DEFINITION 13.  $\mathrm{Kg}^*(a, b; c, d) \leftrightarrow (Ex)(Ey)(Ez)(\mathrm{Is}(x; a, c) \,\&\, \mathrm{Zw}(y; a, x) \,\&\, \mathrm{Zw}(z; c, x) \,\&\, \mathrm{Is}(a; b, y) \,\&\, \mathrm{Is}(c; d, z) \,\&\, \mathrm{Is}(x; y, z))$.

(For Is one may take here either $\mathrm{Is}_1$ or $\mathrm{Is}_2$, as desired.)

Of an axiomatics such as the one sketched here, in which collinearity and
the betweenness relation are coupled with orthogonality, one admittedly
cannot demand that it set apart the axioms of the linear. Furthermore, the
layout is restricted here from the outset to plane geometry, since the
definition of collinearity is no longer applicable in higher dimensions.
The restriction to Euclidean geometry is also introduced at an early
stage. On the other hand, this axiomatization may be particularly suited
to bringing out the great simplicity and elegance of the laws of Euclidean
plane geometry.


Symposium  on  the  Axiomatic  Method 


WHAT  IS  ELEMENTARY  GEOMETRY? 

ALFRED  TARSKI 

Institute  for  Basic  Research  in  Science, 
University  of  California,  Berkeley,  California,   U.S.A. 

In  colloquial  language  the  term  elementary  geometry  is  used  loosely  to 
refer  to  the  body  of  notions  and  theorems  which,  following  the  tradition 
of  Euclid's  Elements,  form  the  subject  matter  of  geometry  courses  in 
secondary  schools.  Thus  the  term  has  no  well  determined  meaning  and 
can be subjected to various interpretations. If we wish to make elementary
geometry a topic of metamathematical investigation and to obtain
exact  results  (not  within,  but)  about  this  discipline,  then  a  choice  of  a 
definite  interpretation  becomes  necessary.  In  fact,  we  have  then  to 
describe  precisely  which  sentences  can  be  formulated  in  elementary 
geometry  and  which  among  them  can  be  recognized  as  valid;  in  other 
words,  we  have  to  determine  the  means  of  expression  and  proof  with 
which  the  discipline  is  provided. 

In  this  paper  we  shall  primarily  concern  ourselves  with  a  conception  of 
elementary  geometry  which  can  roughly  be  described  as  follows:  we 
regard  as  elementary  that  part  of  Euclidean  geometry  which  can  be  formulated 
and established without the help of any set-theoretical devices. ¹

More precisely, elementary geometry is conceived here as a theory with
standard formalization in the sense of [9]. ² It is formalized within
elementary logic, i.e., first-order predicate calculus. All the variables
$x, y, z, \ldots$ occurring in this theory are assumed to range over
elements of a fixed set; the elements are referred to as points, and the
set as the space. The logical constants of the theory are (i) the
sentential connectives: the negation symbol $\neg$, the implication symbol
$\to$, the disjunction symbol $\lor$, and the conjunction symbol $\land$;
(ii) the quantifiers: the universal quantifier $\bigwedge$ and the
existential quantifier $\bigvee$; and (iii) two special binary predicates:
the identity symbol $=$ and the diversity symbol $\neq$. As non-logical
constants (primitive symbols of the theory) we could choose any predicates
denoting certain relations among points in terms of which all geometrical
notions are known to be definable. Actually we pick two predicates for this
purpose: the ternary predicate $\beta$ used to denote the betweenness
relation and the quaternary predicate $\delta$ used to denote the
equidistance relation; the formula $\beta(xyz)$ is read y lies between x
and z (the case when y coincides with x or z not being excluded), while
$\delta(xyzu)$ is read x is as distant from y as z is from u.

1  The paper was prepared for publication while the author was working on a
research project in the foundations of mathematics sponsored by the U.S.
National Science Foundation.

2  One of the main purposes of this paper is to exhibit the significance of
notions and methods of modern logic and metamathematics for the study of
the foundations of geometry. For logical and metamathematical notions
involved in the discussion consult [8] and [9] (see the bibliography at
the end of the paper). The main metamathematical result upon which the
discussion is based was established in [7]. For algebraic notions and
results consult [11].

Several articles in this volume are related to the present paper in methods
and results. This applies in the first place to Scott [5] and Szmielew [6],
and to some extent also to Robinson [3].

Thus, in our formalization of elementary geometry, only points are treated
as individuals and are represented by (first-order) variables. Since
elementary geometry has no set-theoretical basis, its formalization does
not provide for variables of higher orders and no symbols are available to
represent or denote geometrical figures (point sets), classes of
geometrical figures, etc. It should be clear that, nevertheless, we are
able to express in our symbolism all the results which can be found in
textbooks of elementary geometry and which are formulated there in terms
referring to various special classes of geometrical figures, such as the
straight lines, the circles, the segments, the triangles, the quadrangles,
and, more generally, the polygons with a fixed number of vertices, as well
as to certain relations between geometrical figures in these classes, such
as congruence and similarity. This is primarily a consequence of the fact
that, in each of the classes just mentioned, every geometrical figure is
determined by a fixed finite number of points. For instance, instead of
saying that a point z lies on the straight line through the points x and y,
we can state that either $\beta(xyz)$ or $\beta(yzx)$ or $\beta(zxy)$
holds; instead of saying that two segments with the end-points x, y and
x', y' are congruent, we simply state that $\delta(xyx'y')$. ³
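
To make the last two paraphrases tangible, here is a small sketch in the
coordinate plane. It assumes points are pairs of integers (or other exact
numbers) and interprets $\beta$ and $\delta$ in the standard Cartesian
way; the function names `beta`, `delta` and `on_line` are chosen for this
illustration only and do not occur in the paper.

```python
def beta(x, y, z):
    """beta(xyz): y lies between x and z, coincidence with x or z allowed."""
    cross = (y[0] - x[0]) * (z[1] - x[1]) - (y[1] - x[1]) * (z[0] - x[0])
    dot = (y[0] - x[0]) * (y[0] - z[0]) + (y[1] - x[1]) * (y[1] - z[1])
    # Collinear, and the vectors from y to x and from y to z do not point
    # the same way.
    return cross == 0 and dot <= 0


def delta(x, y, z, u):
    """delta(xyzu): x is as distant from y as z is from u (squared lengths)."""
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2
            == (z[0] - u[0]) ** 2 + (z[1] - u[1]) ** 2)


def on_line(x, y, z):
    """z lies on the straight line through x and y, expressed with beta only."""
    return beta(x, y, z) or beta(y, z, x) or beta(z, x, y)
```

Congruence of the segments xy and x'y' is then just `delta(x, y, x1, y1)`,
so no variable for a line or a segment is ever needed.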


3  In various formalizations of geometry (whether elementary or not) which
are known from the literature, and in particular in all those which follow
the lines of [1], not only points but also certain special geometrical
figures are treated as individuals and are represented by first-order
variables; usually the only figures treated this way are straight lines,
planes, and, more generally, linear subspaces. The set-theoretical
relations of membership and inclusion, between a point and a special
geometrical figure or between two such figures, are replaced by the
geometrical relation of incidence, and the symbol denoting this relation is
included in the list of primitive symbols of geometry. All other
geometrical figures are treated as point sets and can be represented by
second-order variables (assuming that the system of geometry discussed is
provided with a set-theoretical basis). This approach has some advantages
for restricted purposes of projective geometry; in fact, it facilitates the
development of projective geometry by yielding a convenient formulation of
the duality principle, and leads to a subsumption of this geometry under
the algebraic theory of lattices. In other branches of geometry an
analogous procedure can hardly be justified; the non-uniform treatment of
geometrical figures seems to be intrinsically unnatural, obscures the
logical structure of the foundations of geometry, and leads to some
complications in the development of this discipline (by necessitating,
e.g., a distinction between a straight line and the set of all points on
this line).

A sentence formulated in our symbolism is regarded as valid if it follows
(semantically) from sentences adopted as axioms, i.e., if it holds in every
mathematical structure in which all the axioms hold. In the present case,
by virtue of the completeness theorem for elementary logic, this amounts to
saying that a sentence is valid if it is derivable from the axioms by means
of some familiar rules of inference. To obtain an appropriate set of
axioms, we start with an axiom system which is known to provide an adequate
basis for the whole of Euclidean geometry and contains $\beta$ and $\delta$
as the only non-logical constants. Usually the only non-elementary sentence
in such a system is the continuity axiom, which contains second-order
variables $X, Y, \ldots$ ranging over arbitrary point sets (in addition to
first-order variables $x, y, \ldots$ ranging over points) and also an
additional logical constant, the membership symbol $\in$ denoting the
membership relation between points and point sets. The continuity axiom can
be formulated, e.g., as follows:

$\bigwedge XY \,\{\bigvee z \bigwedge xy\,[x \in X \land y \in Y \to \beta(zxy)] \to \bigvee u \bigwedge xy\,[x \in X \land y \in Y \to \beta(xuy)]\}$.

We remove this axiom from the system and replace it by the infinite
collection of all elementary continuity axioms, i.e., roughly, by all the
sentences which are obtained from the non-elementary axiom if $x \in X$ is
replaced by an arbitrary elementary formula in which x occurs free, and
$y \in Y$ by an arbitrary elementary formula in which y occurs free. To fix
the ideas, we restrict ourselves in what follows to the two-dimensional
elementary geometry and quote explicitly a simple axiom system obtained in
the way just described. The system consists of twelve individual axioms,
A1-A12, and the infinite collection of all elementary continuity axioms,
A13.

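A1 [IDENTITY AXIOM FOR BETWEENNESS].

$\bigwedge xy\,[\beta(xyx) \rightarrow (x = y)]$

A2 [TRANSITIVITY AXIOM FOR BETWEENNESS].

$\bigwedge xyzu\,[\beta(xyu) \wedge \beta(yzu) \rightarrow \beta(xyz)]$

A3 [CONNECTIVITY AXIOM FOR BETWEENNESS].

$\bigwedge xyzu\,[\beta(xyz) \wedge \beta(xyu) \wedge (x \neq y) \rightarrow \beta(xzu) \vee \beta(xuz)]$

A4 [REFLEXIVITY AXIOM FOR EQUIDISTANCE].

$\bigwedge xy\,[\delta(xyyx)]$

A5 [IDENTITY AXIOM FOR EQUIDISTANCE].

$\bigwedge xyz\,[\delta(xyzz) \rightarrow (x = y)]$

A6 [TRANSITIVITY AXIOM FOR EQUIDISTANCE].

$\bigwedge xyzuvw\,[\delta(xyzu) \wedge \delta(xyvw) \rightarrow \delta(zuvw)]$

A7 [PASCH'S AXIOM].

$\bigwedge txyzu \bigvee v\,[\beta(xtu) \wedge \beta(yuz) \rightarrow \beta(xvy) \wedge \beta(ztv)]$

A8 [EUCLID'S AXIOM].

$\bigwedge txyzu \bigvee vw\,[\beta(xut) \wedge \beta(yuz) \wedge (x \neq u) \rightarrow \beta(xzv) \wedge \beta(xyw) \wedge \beta(vtw)]$

A9 [FIVE-SEGMENT AXIOM].

$\bigwedge xx'yy'zz'uu'\,[\delta(xyx'y') \wedge \delta(yzy'z') \wedge \delta(xux'u') \wedge \delta(yuy'u') \wedge \beta(xyz) \wedge \beta(x'y'z') \wedge (x \neq y) \rightarrow \delta(zuz'u')]$

A10 [AXIOM OF SEGMENT CONSTRUCTION].

$\bigwedge xyuv \bigvee z\,[\beta(xyz) \wedge \delta(yzuv)]$

A11 [LOWER DIMENSION AXIOM].

$\bigvee xyz\,[\neg\beta(xyz) \wedge \neg\beta(yzx) \wedge \neg\beta(zxy)]$

A12 [UPPER DIMENSION AXIOM].

$\bigwedge xyzuv\,[\delta(xuxv) \wedge \delta(yuyv) \wedge \delta(zuzv) \wedge (u \neq v) \rightarrow \beta(xyz) \vee \beta(yzx) \vee \beta(zxy)]$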



A13 [ELEMENTARY CONTINUITY AXIOMS]. All sentences of the form

$\bigwedge vw \ldots \{\bigvee z \bigwedge xy\,[\varphi \wedge \psi \rightarrow \beta(zxy)] \rightarrow \bigvee u \bigwedge xy\,[\varphi \wedge \psi \rightarrow \beta(xuy)]\}$

where $\varphi$ stands for any formula in which the variables $x, v, w, \ldots$, but
neither $y$ nor $z$ nor $u$, occur free, and similarly for $\psi$, with $x$ and $y$
interchanged.

Elementary geometry based upon the axioms just listed will be denoted
by $\mathcal{E}_2$. In Theorems 1-4 below we state fundamental metamathematical
properties of this theory.⁴

First we deal with the representation problem for $\mathcal{E}_2$, i.e., with the
problem of characterizing all models of this theory. By a model of $\mathcal{E}_2$ we
understand a system $\mathfrak{M} = \langle A, B, D \rangle$ such that (i) $A$ is an arbitrary
non-empty set, and $B$ and $D$ are respectively a ternary and a quaternary
relation among elements of $A$; (ii) all the axioms of $\mathcal{E}_2$ prove to hold in $\mathfrak{M}$
if all the variables are assumed to range over elements of $A$, and the
constants $\beta$ and $\delta$ are understood to denote the relations $B$ and $D$,
respectively.

The most familiar examples of models of $\mathcal{E}_2$ (and ones which can
easily be handled by algorithmic methods) are certain Cartesian spaces
over ordered fields. We assume known under what conditions a system
$\mathfrak{F} = \langle F, +, \cdot, \leq \rangle$ (where $F$ is a set, $+$ and $\cdot$ are binary operations
under which $F$ is closed, and $\leq$ is a binary relation between elements of $F$)
is referred to as an ordered field and how the symbols $0$, $x - y$, $x^2$ are
defined for ordered fields. An ordered field $\mathfrak{F}$ will be called Euclidean if
every non-negative element in $F$ is a square; it is called real closed if it is
Euclidean and if every polynomial of an odd degree with coefficients in $F$
has a zero in $F$. Consider the set $A_{\mathfrak{F}} = F \times F$ of all ordered couples


⁴ A brief discussion of the theory $\mathcal{E}_2$ and its metamathematical properties was
given in [7], pp. 43 ff. A detailed development (based upon the results of [7]) can be
found in [4], where, however, the underlying system of elementary geometry
differs from the one discussed in this paper in its logical structure, primitive
symbols, and axioms.

The axiom system for $\mathcal{E}_2$ quoted in the text above is a simplified version of the
system in [7], pp. 55 f. The simplification consists primarily in the omission of
several superfluous axioms. The proof that those superfluous axioms are actually
derivable from the remaining ones was obtained by Eva Kallin, Scott Taylor, and
the author in connection with a course in the foundations of geometry given by the
author at the University of California, Berkeley, during the academic year 1956-57.



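$x = \langle x_1, x_2 \rangle$ with $x_1$ and $x_2$ in $F$. We define the relations $B_{\mathfrak{F}}$ and $D_{\mathfrak{F}}$
among such couples by means of the following stipulations:

$B_{\mathfrak{F}}(xyz)$ if and only if $(x_1 - y_1) \cdot (y_2 - z_2) = (x_2 - y_2) \cdot (y_1 - z_1)$,
$0 \leq (x_1 - y_1) \cdot (y_1 - z_1)$, and $0 \leq (x_2 - y_2) \cdot (y_2 - z_2)$;

$D_{\mathfrak{F}}(xyzu)$ if and only if $(x_1 - y_1)^2 + (x_2 - y_2)^2 = (z_1 - u_1)^2 + (z_2 - u_2)^2$.

The system $\mathfrak{C}_2(\mathfrak{F}) = \langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ is called the (two-dimensional)
Cartesian space over $\mathfrak{F}$. If in particular we take for $\mathfrak{F}$ the ordered field $\mathfrak{R}$
of real numbers, we obtain the ordinary (two-dimensional) analytic space
$\mathfrak{C}_2(\mathfrak{R})$.⁵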
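The two stipulations translate directly into executable form. The following Python sketch is only an illustration under assumptions of our own: it takes the ordered field of rationals (via `fractions.Fraction`) as the field, and the names `pt`, `between` and `equidistant` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction


def pt(a, b):
    """Build an ordered couple with exact rational coordinates."""
    return (Fraction(a), Fraction(b))


def between(x, y, z):
    """Betweenness of couples: y lies on the segment from x to z (ends included)."""
    (x1, x2), (y1, y2), (z1, z2) = x, y, z
    collinear = (x1 - y1) * (y2 - z2) == (x2 - y2) * (y1 - z1)
    return (collinear
            and 0 <= (x1 - y1) * (y1 - z1)
            and 0 <= (x2 - y2) * (y2 - z2))


def equidistant(x, y, z, u):
    """Equidistance of couples: segment xy is as long as segment zu."""
    (x1, x2), (y1, y2), (z1, z2), (u1, u2) = x, y, z, u
    return (x1 - y1) ** 2 + (x2 - y2) ** 2 == (z1 - u1) ** 2 + (z2 - u2) ** 2


if __name__ == "__main__":
    print(between(pt(0, 0), pt(1, 1), pt(2, 2)))
    print(between(pt(0, 0), pt(2, 2), pt(1, 1)))
    print(equidistant(pt(0, 0), pt(3, 4), pt(1, 1), pt(6, 1)))
```

Exact rational arithmetic is used so that the equalities in the definitions are tested without rounding error; the rationals form an ordered field, though not a Euclidean one.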


THEOREM 1 (REPRESENTATION THEOREM). For $\mathfrak{M}$ to be a model of $\mathcal{E}_2$ it is
necessary and sufficient that $\mathfrak{M}$ be isomorphic with the Cartesian space
$\mathfrak{C}_2(\mathfrak{F})$ over some real closed field $\mathfrak{F}$.

PROOF (in outline). It is well known that all the axioms of $\mathcal{E}_2$ hold in
$\mathfrak{C}_2(\mathfrak{R})$ and that therefore $\mathfrak{C}_2(\mathfrak{R})$ is a model of $\mathcal{E}_2$. By a fundamental result
in [7], every real closed field $\mathfrak{F}$ is elementarily equivalent with the field $\mathfrak{R}$,
i.e., every elementary (first-order) sentence which holds in one of these
two fields holds also in the other. Consequently every Cartesian space
$\mathfrak{C}_2(\mathfrak{F})$ over a real closed field $\mathfrak{F}$ is elementarily equivalent with $\mathfrak{C}_2(\mathfrak{R})$ and
hence is a model of $\mathcal{E}_2$; this clearly applies to all systems $\mathfrak{M}$ isomorphic
with $\mathfrak{C}_2(\mathfrak{F})$ as well.

To prove the theorem in the opposite direction, we apply methods and
results of the elementary geometrical theory of proportions, which has
been developed in the literature on several occasions (see, e.g., [1], pp.
51 ff.). Consider a model $\mathfrak{M} = \langle A, B, D \rangle$ of $\mathcal{E}_2$; let $z$ and $u$ be any two
distinct points of $A$, and $F$ be the straight line through $z$ and $u$, i.e., the
set of all points $x$ such that $B(zux)$ or $B(uxz)$ or $B(xzu)$. Applying some
familiar geometrical constructions, we define the operations $+$ and $\cdot$ on,
and the relation $\leq$ between, any two points $x$ and $y$ in $F$. Thus we say
that $x \leq y$ if either $x = y$ or else $B(xzu)$ and not $B(yxu)$ or, finally,


⁵ All the results in this paper extend (with obvious changes) to the $n$-dimensional
case for any positive integer $n$. To obtain an axiom system for $\mathcal{E}_n$ we have to modify
the two dimension axioms, A11 and A12, leaving the remaining axioms unchanged;
by a result in [5], A11 and A12 can be replaced by any sentence formulated in the
symbolism of $\mathcal{E}_n$ which holds in the ordinary $n$-dimensional analytic space but not
in any $m$-dimensional analytic space for $m \neq n$. In constructing algebraic models
for one-dimensional geometries we use ordered abelian groups instead of ordered
fields.



$B(zxy)$ and not $B(xzu)$; $x + y$ is defined as the unique point $v$ in $F$ such
that $D(zxyv)$ and either $z \leq x$ and $y \leq v$ or else $x \leq z$ and $v \leq y$. The
definition of $x \cdot y$ is more involved; it refers to some points outside of $F$
and is essentially based upon the properties of parallel lines. Using
exclusively axioms A1-A12 we show that $\mathfrak{F} = \langle F, +, \cdot, \leq \rangle$ is an ordered
field; with the help of A13 we arrive at the conclusion that $\mathfrak{F}$ is actually
a real closed field. By considering a straight line $G$ perpendicular to $F$ at
the point $z$, we introduce a rectangular coordinate system in $\mathfrak{M}$ and we
establish a one-to-one correspondence between points $x, y, \ldots$ in $A$ and
ordered couples of their coordinates $x = \langle x_1, x_2 \rangle$, $y = \langle y_1, y_2 \rangle$, $\ldots$ in
$F \times F$. With the help of the Pythagorean theorem (which proves to be
valid in $\mathcal{E}_2$) we show that the formula

$D(xyst)$

holds for any given points $x, y, \ldots$ in $A$ if and only if the formula

$D_{\mathfrak{F}}(xyst)$

holds for the correlated couples of coordinates $x = \langle x_1, x_2 \rangle$, $y = \langle y_1, y_2 \rangle$,
$\ldots$ in $F \times F$, i.e., if

$(x_1 - y_1)^2 + (x_2 - y_2)^2 = (s_1 - t_1)^2 + (s_2 - t_2)^2$;

an analogous conclusion is obtained for $B(xys)$. Consequently, the
systems $\mathfrak{M}$ and $\mathfrak{C}_2(\mathfrak{F})$ are isomorphic, which completes the proof.
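The geometric definition of the order relation on the line $F$ used in this proof can be checked against an ordinary field order in a concrete case. The Python sketch below is an illustration under assumptions of our own: it works in the Cartesian plane over the rationals, fixes $z = (0, 0)$ and $u = (1, 0)$, and uses hypothetical helper names (`between`, `leq`, `on_axis`, `agrees_with_field_order`). It then tests whether the relation defined purely from betweenness coincides with the usual order of first coordinates.

```python
from fractions import Fraction


def between(x, y, z):
    """Betweenness in the Cartesian plane over the rationals."""
    (x1, x2), (y1, y2), (z1, z2) = x, y, z
    collinear = (x1 - y1) * (y2 - z2) == (x2 - y2) * (y1 - z1)
    return (collinear
            and 0 <= (x1 - y1) * (y1 - z1)
            and 0 <= (x2 - y2) * (y2 - z2))


# Assumed choice of the two distinct points z and u fixing the line F.
Z = (Fraction(0), Fraction(0))
U = (Fraction(1), Fraction(0))


def leq(x, y):
    """x <= y on the line through Z and U, defined from betweenness alone."""
    return (x == y
            or (between(x, Z, U) and not between(y, x, U))
            or (between(Z, x, y) and not between(x, Z, U)))


def on_axis(a):
    """The point of the line F with first coordinate a."""
    return (Fraction(a), Fraction(0))


def agrees_with_field_order(values):
    """Compare the geometric order with the field order on sample values."""
    return all(leq(on_axis(a), on_axis(b)) == (a <= b)
               for a in values for b in values)


if __name__ == "__main__":
    samples = [Fraction(n, 2) for n in range(-6, 7)]
    print(agrees_with_field_order(samples))
```

The check covers only finitely many sample points and one choice of $z$ and $u$; it illustrates the definition and is no substitute for the proof from the axioms.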

We turn to the completeness problem for $\mathcal{E}_2$. A theory is called complete
if every sentence $\sigma$ (formulated in the symbolism of the theory) holds
either in every model of this theory or in no such model. For theories
with standard formalization this definition can be put in several other
equivalent forms; we can say, e.g., that a theory is complete if, for every
sentence $\sigma$, either $\sigma$ or $\neg\sigma$ is valid, or if any two models of the theory are
elementarily equivalent. A theory is called consistent if it has at least one
model; here, again, several equivalent formulations are known. If there
is a model $\mathfrak{M}$ such that a sentence holds in $\mathfrak{M}$ if and only if it is valid in the
given theory, then the theory is clearly both complete and consistent,
and conversely. The solution of the completeness problem for $\mathcal{E}_2$ is given
in the following

THEOREM 2 (COMPLETENESS THEOREM). (i) A sentence formulated in $\mathcal{E}_2$
is valid if and only if it holds in $\mathfrak{C}_2(\mathfrak{R})$;

(ii) the theory $\mathcal{E}_2$ is complete (and consistent).



Part (i) of this theorem follows from Theorem 1 and from a fundamental
result in [7] which was applied in the proof of Theorem 1; (ii) is an
immediate consequence of (i).

The next problem which will be discussed here is the decision problem
for $\mathcal{E}_2$. It is the problem of the existence of a mechanical method which
enables us in each particular case to decide whether or not a given
sentence formulated in $\mathcal{E}_2$ is valid. The solution of this problem is again
positive:

THEOREM 3 (DECISION THEOREM). The theory $\mathcal{E}_2$ is decidable.

In fact, $\mathcal{E}_2$ is complete by Theorem 2 and is axiomatizable by its very
description (i.e., it has an axiom system such that we can always decide
whether a given sentence is an axiom). It is known, however, that every
complete and axiomatizable theory with standard formalization is
decidable (cf., e.g., [9], p. 14), and therefore $\mathcal{E}_2$ is decidable. By analyzing the
discussion in [7] we can actually obtain a decision method for $\mathcal{E}_2$.
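The general fact invoked here, that a complete and axiomatizable theory is decidable, rests on a simple search: enumerate the valid sentences mechanically and stop as soon as either the given sentence or its negation appears. The Python sketch below illustrates that search on a toy example of our own. The enumerator `toy_theorems`, the string encoding of sentences, and the `~` prefix for negation are all assumptions made for illustration; this is not the decision method of [7], which proceeds quite differently.

```python
from itertools import count


def negate(sentence):
    """Syntactic negation on the toy encoding: add or strip a leading '~'."""
    return sentence[1:] if sentence.startswith("~") else "~" + sentence


def decide(sentence, enumerate_theorems):
    """Return True if `sentence` is enumerated, False if its negation is.

    Termination relies on completeness: one of the two must eventually
    appear in the enumeration of valid sentences.
    """
    negation = negate(sentence)
    for theorem in enumerate_theorems():
        if theorem == sentence:
            return True
        if theorem == negation:
            return False


def toy_theorems():
    """Toy complete theory: for each n, exactly one of even(n), ~even(n)."""
    for n in count():
        yield f"even({n})" if n % 2 == 0 else f"~even({n})"


if __name__ == "__main__":
    print(decide("even(4)", toy_theorems))
    print(decide("even(7)", toy_theorems))
```

For an incomplete theory the same loop may run forever on a sentence that is neither valid nor refutable, which is why completeness is essential to the argument.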

The last metamathematical problem to be discussed for $\mathcal{E}_2$ is the
problem of finite axiomatizability. From the description of $\mathcal{E}_2$ we see that
this theory has an axiom system consisting of finitely many individual
axioms and of an infinite collection of axioms falling under a single axiom
schema. This axiom schema (which is the symbolic expression occurring
in A13) can be slightly modified so as to form a single sentence in the
system of predicate calculus with free variable first-order predicates, and
all the particular axioms of the infinite collection can be obtained from
this sentence by substitution. We briefly describe the whole situation by
saying that the theory $\mathcal{E}_2$ is "almost finitely axiomatizable", and we now
ask the question whether $\mathcal{E}_2$ is finitely axiomatizable in the strict sense,
i.e., whether the original axiom system can be replaced by an equivalent
finite system of sentences formulated in $\mathcal{E}_2$. The answer is negative:

THEOREM 4 (NON-FINITIZABILITY THEOREM). The theory $\mathcal{E}_2$ is not
finitely axiomatizable.

PROOF (in outline). From the proof of Theorem 1 it is seen that the
infinite collection of axioms A13 can be equivalently replaced by an infinite
sequence of sentences $S_0, \ldots, S_n, \ldots$; $S_0$ states that the ordered field $\mathfrak{F}$
constructed in the proof of Theorem 1 is Euclidean, and $S_n$ for $n > 0$
expresses the fact that in this field every polynomial of degree $2n + 1$
has a zero. For every prime number $p$ we can easily construct an ordered



field $\mathfrak{F}_p$ in which every polynomial of an odd degree $2n + 1 < p$ has a
zero while some polynomial of degree $p$ has no zero; consequently, if
$2m + 1 = p$ is a prime, then all the axioms A1-A12 and $S_n$ with $n < m$
hold in $\mathfrak{C}_2(\mathfrak{F}_p)$ while $S_m$ does not hold. This implies immediately that the
infinite axiom system A1, ..., A12, $S_0, \ldots, S_n, \ldots$ has no finite
subsystem from which all the axioms of the system follow. Hence by a simple
argument we conclude that, more generally, there is no finite axiom
system which is equivalent with the original axiom system for $\mathcal{E}_2$.

From the proof just outlined we see that $\mathcal{E}_2$ can be based upon an
axiom system A1, ..., A12, $S_0, \ldots, S_n, \ldots$ in which (as opposed to the
original axiom system) each axiom can be put in the form of either a
universal sentence or an existential sentence or a universal-existential
sentence; i.e., each axiom is either of the form

$\bigwedge xy \ldots (\varphi)$

or else of the form

$\bigvee uv \ldots (\varphi)$

or, finally, of the form

$\bigwedge xy \ldots \bigvee uv \ldots (\varphi)$

where $\varphi$ is a formula without quantifiers. A rather obvious consequence
of this structural property of the axioms is the fact that the union of a
chain (or of a directed family) of models of $\mathcal{E}_2$ is again a model of $\mathcal{E}_2$. This
consequence can also be derived directly from the proof of Theorem 1.

The conception of elementary geometry with which we have been
concerned so far is certainly not the only feasible one. In what follows we
shall discuss briefly two other possible interpretations of the term
"elementary geometry"; they will be embodied in two different formalized
theories, $\mathcal{E}_2'$ and $\mathcal{E}_2''$.

The theory $\mathcal{E}_2'$ is obtained by supplementing the logical base of $\mathcal{E}_2$
with a small fragment of set theory. Specifically, we include in the
symbolism of $\mathcal{E}_2'$ new variables $X, Y, \ldots$ assumed to range over arbitrary
finite sets of points (or, what in this case amounts essentially to the same,
over arbitrary finite sequences of points); we also include a new logical
constant, the membership symbol $\in$, to denote the membership relation
between points and finite point sets. As axioms for $\mathcal{E}_2'$ we again choose
A1-A13; it should be noticed, however, that the collection of axioms A13



is now more comprehensive than in the case of $\mathcal{E}_2$ since $\varphi$ and $\psi$ stand for
arbitrary formulas constructed in the symbolism of $\mathcal{E}_2'$. In consequence
the theory $\mathcal{E}_2'$ considerably exceeds $\mathcal{E}_2$ in means of expression and power.
In $\mathcal{E}_2'$ we can formulate and study various notions which are traditionally
discussed in textbooks of elementary geometry but which cannot be
expressed in $\mathcal{E}_2$, e.g., the notions of a polygon with arbitrarily many
vertices, and of the circumference and the area of a circle.

As regards metamathematical problems which have been discussed
and solved for $\mathcal{E}_2$ in Theorems 1-4, three of them (the problems of
representation, completeness, and finite axiomatizability) are still open
when referred to $\mathcal{E}_2'$. In particular, we do not know any simple
characterization of all models of $\mathcal{E}_2'$, nor do we know whether any two such
models are equivalent with respect to all sentences formulated in $\mathcal{E}_2'$.
(When speaking of models of $\mathcal{E}_2'$ we mean exclusively the so-called
standard models; i.e., when deciding whether a sentence $\sigma$ formulated in
$\mathcal{E}_2'$ holds in a given model, we assume that the variables $x, y, \ldots$
occurring in $\sigma$ range over all elements of a set, the variables $X, Y, \ldots$ range
over all finite subsets of this set, and $\in$ is always understood to denote the
membership relation.) The Archimedean postulate can be formulated and
proves to be valid in $\mathcal{E}_2'$. Hence, by Theorem 1, every model of $\mathcal{E}_2'$ is
isomorphic with a Cartesian space $\mathfrak{C}_2(\mathfrak{F})$ over some Archimedean real
closed field $\mathfrak{F}$. There are, however, Archimedean real closed fields $\mathfrak{F}$ such
that $\mathfrak{C}_2(\mathfrak{F})$ is not a model of $\mathcal{E}_2'$; e.g., the field of real algebraic numbers is
of this kind. A consequence of the Archimedean postulate is that every
model of $\mathcal{E}_2'$ has at most the power of the continuum (while, if only by
virtue of Theorem 1, $\mathcal{E}_2$ has models with arbitrary infinite powers). In
fact, $\mathcal{E}_2'$ has models which have exactly the power of the continuum, e.g.,
$\mathfrak{C}_2(\mathfrak{R})$, but it can also be shown to have denumerable models. Thus,
although the theory $\mathcal{E}_2'$ may prove to be complete, it certainly has
non-isomorphic models and therefore is not categorical.⁶

Only the decision problem for $\mathcal{E}_2'$ has found so far a definite solution:


⁶ These last remarks result from a general metamathematical theorem (an
extension of the Skolem-Löwenheim theorem) which applies to all theories with the
same logical structure as $\mathcal{E}_2'$, i.e., to all theories obtained from theories with
standard formalization by including new variables ranging over arbitrary finite sets and
a new logical constant, the membership symbol $\in$, and possibly by extending
original axiom systems. By this general theorem, if $\mathcal{T}$ is a theory of the class just
described with at most $\beta$ different symbols, and if a mathematical system $\mathfrak{M}$ is a



THEOREM 5. The theory $\mathcal{E}_2'$ is undecidable, and so are all its consistent
extensions.

This follows from the fact that Peano's arithmetic is (relatively)
interpretable in $\mathcal{E}_2'$; cf. [9], pp. 31 ff.

To obtain the theory $\mathcal{E}_2''$ we leave the symbolism of $\mathcal{E}_2$ unchanged but
we weaken the axiom system of $\mathcal{E}_2$. In fact, we replace the infinite
collection of elementary continuity axioms, A13, by a single sentence,
A13', which is a consequence of one of these axioms. The sentence
expresses the fact that a segment which joins two points, one inside and one
outside a given circle, always intersects the circle; symbolically:

A13'. $\bigwedge xyzx'z'u \bigvee y'\,[\delta(uxux') \wedge \delta(uzuz') \wedge \beta(uxz) \wedge \beta(xyz) \rightarrow \delta(uyuy') \wedge \beta(x'y'z')]$
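To see what A13' demands of the underlying field, one can test it in the Cartesian plane over the rationals, which form an ordered field that is not Euclidean (2 has no rational square root). The Python sketch below is an illustration under assumptions of our own: the sample points $u = (0,0)$, $x = x' = (1,0)$, $y = (2,0)$, $z = (3,0)$, $z' = (0,3)$ and the helper names are not from the text. These points satisfy the hypotheses of A13', and a point $y'$ on the line through $x'$ and $z'$ at distance 2 from $u$ corresponds to a root $t$ of a quadratic $at^2 + bt + c = 0$ with integer coefficients. Such a root is rational only if the discriminant is a perfect square.

```python
from math import isqrt


def circle_segment_discriminant(x_prime, z_prime, r_squared):
    """Discriminant of |x' + t (z' - x')|^2 = r^2, circle centred at the origin.

    Coordinates and r_squared are assumed to be integers.
    """
    dx = z_prime[0] - x_prime[0]
    dy = z_prime[1] - x_prime[1]
    a = dx * dx + dy * dy
    b = 2 * (x_prime[0] * dx + x_prime[1] * dy)
    c = x_prime[0] ** 2 + x_prime[1] ** 2 - r_squared
    return b * b - 4 * a * c


def is_perfect_square(n):
    """True when the integer n is the square of an integer."""
    return n >= 0 and isqrt(n) ** 2 == n


if __name__ == "__main__":
    # Segment from x' = (1, 0) to z' = (0, 3), circle of radius 2 about u.
    d = circle_segment_discriminant((1, 0), (0, 3), 4)
    print(d, is_perfect_square(d))
```

For these sample points the discriminant is not a perfect square, so the required intersection point has no rational coordinates and A13' fails in the plane over the rationals. Taking $z' = (3, 0)$ instead gives a perfect-square discriminant and the rational intersection point $(2, 0)$.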

As a consequence of the weakening of the axiom system, various
sentences which are formulated and valid in $\mathcal{E}_2$ are no longer valid in $\mathcal{E}_2''$.
This applies in particular to existential theorems which cannot be
established by means of so-called elementary geometrical constructions
(using exclusively ruler and compass), e.g., to the theorem on the
trisection of an arbitrary angle.

With regard to metamathematical problems discussed in this paper the
situation in the case of $\mathcal{E}_2''$ is just opposite to that encountered in the
case of $\mathcal{E}_2'$. The three problems which are open for $\mathcal{E}_2'$ admit of simple
solutions when referred to $\mathcal{E}_2''$. In particular, the solution of the
representation problem is given in the following


standard model of $\mathcal{T}$ with an infinite power $\alpha$, then $\mathfrak{M}$ has subsystems with any
infinite power $\gamma$, $\beta \leq \gamma \leq \alpha$, which are also standard models of $\mathcal{T}$. The proof of this
theorem (recently found by the author) has not yet been published; it differs but
slightly from the proof of the analogous theorem for the theories with standard
formalization outlined in [10], pp. 92 f. In opposition to theories with standard
formalization, some of the theories $\mathcal{T}$ discussed in this footnote have models with
an infinite power $\alpha$ and with any smaller, but with no larger, infinite power; an
example is provided by the theory $\mathcal{E}_2'$ for which $\alpha$ is the power of the continuum.
In particular, some of the theories $\mathcal{T}$ have exclusively denumerable models and in
fact are categorical; this applies, e.g., to the theory obtained from Peano's
arithmetic in exactly the same way in which $\mathcal{E}_2'$ has been obtained from $\mathcal{E}_2$. There are
also theories $\mathcal{T}$ which have models with arbitrary infinite powers; such is, e.g., the
theory $\mathcal{E}_2'''$ mentioned at the end of this paper.



THEOREM 6. For $\mathfrak{M}$ to be a model of $\mathcal{E}_2''$ it is necessary and sufficient that
$\mathfrak{M}$ be isomorphic with the Cartesian space $\mathfrak{C}_2(\mathfrak{F})$ over some Euclidean field $\mathfrak{F}$.

This theorem is essentially known from the literature. The sufficiency
of the condition can be checked directly; the necessity can be established
with the help of the elementary geometrical theory of proportions (cf. the
proof of Theorem 1).

Using Theorem 6 we easily show that the theory $\mathcal{E}_2''$ is incomplete,
and from the description of $\mathcal{E}_2''$ we see at once that this theory is finitely
axiomatizable.

On the other hand, the decision problem for $\mathcal{E}_2''$ remains open and
presumably is difficult. In the light of the results in [2] it seems likely that
the solution of this problem is negative; the author would risk the (much
stronger) conjecture that no finitely axiomatizable subtheory of $\mathcal{E}_2$ is
decidable. If we agree to refer to an elementary geometrical sentence (i.e.,
a sentence formulated in $\mathcal{E}_2$) as valid if it is valid in $\mathcal{E}_2$, and as elementarily
provable if it is valid in $\mathcal{E}_2''$, then the situation can be described as
follows: we know a general mechanical method for deciding whether a given
elementary geometrical sentence is valid, but we do not, and probably shall
never know, any such method for deciding whether a sentence of this sort is
elementarily provable.

The differences between $\mathcal{E}_2$ and $\mathcal{E}_2''$ vanish when we restrict ourselves
to universal sentences. In fact, we have

THEOREM 7. A universal sentence formulated in $\mathcal{E}_2$ is valid in $\mathcal{E}_2$ if
and only if it is valid in $\mathcal{E}_2''$.

To prove this we recall that every ordered field can be extended to a
real closed field. Hence, by Theorems 1 and 6, every model of $\mathcal{E}_2''$ can be
extended to a model of $\mathcal{E}_2$. Consequently, every universal sentence
which is valid in $\mathcal{E}_2$ is also valid in $\mathcal{E}_2''$; the converse is obvious. (An even
simpler proof of Theorem 7, and in fact a proof independent of Theorem 1,
can be based upon the lemma by which every finite subsystem of an
ordered field can be isomorphically embedded in the ordered field of real
numbers.)

Theorem 7 remains valid if we remove A13' from the axiom system of
$\mathcal{E}_2''$ (and it applies even to some still weaker axiom systems). Thus we see
that every elementary universal sentence which is valid in $\mathcal{E}_2$ can be
proved without any help of the continuity axioms. The result extends to



all the sentences which may not be universal when formulated in $\mathcal{E}_2$ but
which, roughly speaking, become universal when expressed in the
notation of Cartesian spaces $\mathfrak{C}_2(\mathfrak{F})$.

As an immediate consequence of Theorems 3 and 7 we obtain:

THEOREM 8. The theory $\mathcal{E}_2''$ is decidable with respect to the set of its
universal sentences.

This means that there is a mechanical method for deciding in each
particular case whether or not a given universal sentence formulated in
the theory $\mathcal{E}_2''$ holds in every model of this theory.

We could discuss some further theories related to $\mathcal{E}_2$, $\mathcal{E}_2'$, and $\mathcal{E}_2''$;
e.g., the theory $\mathcal{E}_2'''$ which has the same symbolism as $\mathcal{E}_2'$ and the same
axiom system as $\mathcal{E}_2''$. The problem of deciding which of the various
formal conceptions of elementary geometry is closer to the historical
tradition and the colloquial usage of this notion seems to be rather
hopeless and deprived of broader interest. The author feels that, among
these various conceptions, the one embodied in $\mathcal{E}_2$ distinguishes itself by
the simplicity and clarity of underlying intuitions and by the harmony
and power of its metamathematical implications.


Bibliography 

[1] HILBERT, D., Grundlagen der Geometrie. Eighth edition, with revisions and
supplements by P. BERNAYS, Stuttgart 1956, III + 251 pp.

[2] ROBINSON, J., Definability and decision problems in arithmetic. Journal of
Symbolic Logic, vol. 14 (1949), pp. 98-114.

[3] ROBINSON, R. M., Binary relations as primitive notions in elementary geometry.
This volume, pp. 68-85.

[4] SCHWABHÄUSER, W., Über die Vollständigkeit der elementaren euklidischen
Geometrie. Zeitschrift für mathematische Logik und Grundlagen der
Mathematik, vol. 2 (1956), pp. 137-165.

[5] SCOTT, D., Dimension in elementary Euclidean geometry. This volume, pp.
53-67.

[6] SZMIELEW, W., Some metamathematical problems concerning elementary
hyperbolic geometry. This volume, pp. 30-52.

[7] TARSKI, A., A decision method for elementary algebra and geometry. Second
edition, Berkeley and Los Angeles 1951, VI + 63 pp.

[8] TARSKI, A., Contributions to the theory of models. Indagationes Mathematicae,
vol. 16 (1954), pp. 572-588, and vol. 17 (1955), pp. 56-64.

[9] TARSKI, A., MOSTOWSKI, A., and ROBINSON, R. M., Undecidable theories.
Amsterdam 1953, XI + 98 pp.

[10] TARSKI, A., and VAUGHT, R. L., Arithmetical extensions of relational systems.
Compositio Mathematica, vol. 13 (1957), pp. 81-102.

[11] VAN DER WAERDEN, B. L., Modern Algebra. Revised English edition, New
York, 1953, vol. 1, XII + 264 pp.


Symposium  on  the  Axiomatic  Method 


SOME  METAMATHEMATICAL  PROBLEMS  CONCERNING 
ELEMENTARY  HYPERBOLIC  GEOMETRY 

WANDA  SZMIELEW 

University  of  Warsaw,  Warsaw,  Poland,  and  Institute  for  Basic  Research  in  Science, 
University  of  California,   Berkeley,   California,    U.S.A. 

Introduction. In this paper we shall be concerned with a formalized
system $\mathcal{H}_n$ of elementary n-dimensional hyperbolic (Bolyai-Lobachevskian)
geometry. Throughout the paper we shall use the notation introduced by
Tarski in [5]. In particular the system $\mathcal{H}_n$ has the same logical structure
and the same symbolism as Tarski's system $\mathcal{E}_2$ of elementary Euclidean
geometry. In case $n = 2$ it differs from $\mathcal{E}_2$ only in that Euclid's axiom,
A8, has been replaced by its negation; for $n > 2$ the dimension axioms,
A11 and A12, should, in addition, be appropriately modified. The aim of
this paper is to extend to the system $\mathcal{H}_n$ the fundamental
metamathematical results stated in [5] for the system $\mathcal{E}_2$.¹

The paper is divided into three sections. In Section 1 we shall indicate
how the solutions of the metamathematical problems in which we are
interested can be obtained by means of a familiar algorithm, the
end-calculus of Hilbert (cf. [1], pp. 159 ff.). In Section 2 we shall construct
a new geometrical algorithm, the hyperbolic calculus of segments, which
will prove to provide a convenient apparatus for a new solution of the
same problems. The results established in Sections 1 and 2 have
interesting implications for some related geometrical systems, in fact, for
elementary absolute geometry (i.e., the common part of elementary
Euclidean and hyperbolic geometries) and for non-elementary hyperbolic
geometry. These implications will be discussed in Section 3.²

1. Hilbert-Szász Spaces. In [1] Hilbert gives an outline of his
end-calculus, defines in its terms the coordinates of straight lines and points


¹ The results of this paper were obtained while the author was working in the
University of California, Berkeley on a research project in the foundations of
mathematics sponsored by the U.S. National Science Foundation.

² All the observations which will be given in Section 1 and those concerning
absolute geometry in Section 3 have been made jointly by Tarski and the author.


and establishes an analytic condition for a point to lie on a straight line.
The whole discussion is done in a system of hyperbolic geometry included
in $\mathcal{H}_2$. In [3] Szász somewhat modifies Hilbert's construction and moreover
establishes an analytic formula for the distance between two points.
The latter formula is essential for our purposes and therefore we shall
refer in what follows to [3] and not to [1].

The discussion in [3] leads to an important class of models of $\mathcal{H}_2$ which
can be obtained by means of the following algebraic construction:
Consider an arbitrary ordered field $\mathfrak{F} = \langle F, +, \cdot, \leq \rangle$. For any ordered
triples $x = \langle x_1, x_2, x_3 \rangle$ and $y = \langle y_1, y_2, y_3 \rangle$ in $F \times F \times F$ let

$\Phi(x, y) = x_1 \cdot y_1 + x_2 \cdot y_2 - x_3 \cdot y_3$.

By $A_{\mathfrak{F}}$ we denote the subset of $F \times F \times F$ consisting of all those triples
$x$ for which

$\Phi(x, x) = -1$ and $x_3 > 0$.

By $B_{\mathfrak{F}}$ (the betweenness relation) we denote the ternary relation which
holds among the triples $x, y, z$ in $A_{\mathfrak{F}}$ if and only if

$\Phi(u, u) > 0$, $\Phi(u, x) = 0$, $\Phi(u, y) = 0$, $\Phi(u, z) = 0$ for some $u \in F \times F \times F$

and moreover

$\Phi(x, y) \geq \Phi(x, z)$ and $\Phi(y, z) \geq \Phi(x, z)$.

Finally, by $D_{\mathfrak{F}}$ (the equidistance relation) we denote the quaternary
relation which holds among the triples $x, y, z, u$ in $A_{\mathfrak{F}}$ if and only if

$\Phi(x, y) = \Phi(z, u)$.

The system $\langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ thus obtained will be denoted by $\mathfrak{H}_2(\mathfrak{F})$ and
will be referred to as the two-dimensional Hilbert-Szász space over the
field $\mathfrak{F}$.
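The algebraic construction just given can be tried out on concrete triples. The Python sketch below is an illustration under assumptions of our own: it takes the ordered field of rationals as the underlying field, the helper names are hypothetical, and the betweenness test expects the witness triple $u$ to be supplied by the caller rather than searched for.

```python
from fractions import Fraction as Fr


def phi(x, y):
    """The bilinear form x1*y1 + x2*y2 - x3*y3 on triples."""
    return x[0] * y[0] + x[1] * y[1] - x[2] * y[2]


def in_space(x):
    """Membership in the point set: phi(x, x) = -1 and x3 > 0."""
    return phi(x, x) == -1 and x[2] > 0


def between_with_witness(x, y, z, u):
    """Betweenness of x, y, z, given a candidate witness u for the common line."""
    on_common_line = (phi(u, u) > 0 and phi(u, x) == 0
                      and phi(u, y) == 0 and phi(u, z) == 0)
    return (on_common_line
            and phi(x, y) >= phi(x, z)
            and phi(y, z) >= phi(x, z))


def equidistant(x, y, z, u):
    """Equidistance: phi(x, y) = phi(z, u)."""
    return phi(x, y) == phi(z, u)


if __name__ == "__main__":
    p = (Fr(0), Fr(0), Fr(1))
    q = (Fr(3, 4), Fr(0), Fr(5, 4))
    r = (Fr(4, 3), Fr(0), Fr(5, 3))
    s = (Fr(0), Fr(3, 4), Fr(5, 4))
    w = (Fr(0), Fr(1), Fr(0))  # witness: p, q, r all have second coordinate 0
    print(all(in_space(t) for t in (p, q, r, s)))
    print(between_with_witness(p, q, r, w))
    print(equidistant(p, q, p, s))
```

Passing the witness explicitly keeps the sketch short; a fuller treatment would compute a suitable $u$ from two of the given points.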

As a direct consequence of Szász' discussion the following result is
obtained: Every model of $\mathcal{H}_2$ is isomorphic with the space $\mathfrak{H}_2(\mathfrak{F})$ over
some Euclidean field $\mathfrak{F}$. By supplementing the argument of Szász we
easily show, by means of the elementary continuity axioms, A13 (see
[5], p. 20), that the field $\mathfrak{F}$ is real closed. Since $\mathcal{H}_2$ has a model, the
space $\mathfrak{H}_2(\mathfrak{F})$ is a model of $\mathcal{H}_2$ for some real closed field $\mathfrak{F}$. And since, by a
fundamental result of Tarski in [4], any two real closed fields $\mathfrak{F}'$ and $\mathfrak{F}''$
are elementarily equivalent, so are also the spaces $\mathfrak{H}_2(\mathfrak{F}')$ and $\mathfrak{H}_2(\mathfrak{F}'')$, and


32  WANDA   SZMIELEW 

consequently  each  of  the  spaces  §2®)  i§  a  model  of  ,#2.  This  clearly 
applies  to  all  the  systems  isomorphic  with  Jp2($)  as  well.  Thus  we  have 
arrived  at  the  following 

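THEOREM 1.1. (REPRESENTATION THEOREM). A system $\mathfrak{M} = \langle A, B, D \rangle$
is a model of $\mathcal{H}_2$ if and only if it is isomorphic with the Hilbert-Szász space
$\mathfrak{H}_2(\mathfrak{F}) = \langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ over some real closed field $\mathfrak{F}$.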

Theorem 1.1 implies as a corollary

THEOREM 1.2. The theory $\mathcal{H}_2$ is complete and decidable but not finitely
axiomatizable.

The proof of Theorem 1.2 is quite analogous to the proof of the
corresponding results (Theorems 2, 3, 4) for the system $\mathcal{E}_2$ in [5].

In  this  way  we  have  established  the  fundamental  metamathematical 
properties  of  two-dimensional  elementary  hyperbolic  geometry.  The 
extension of these results to n-dimensional geometries does not seem to
present  any  essential  difficulty. 

2.  Klein  Spaces  and  Hyperbolic  Calculus  of  Segments.  In  this  section  we 
wish to establish fundamental metamathematical properties of $\mathcal{H}_n$ by
using  in  the  representation  theorem,  instead  of  the  Hilbert-Szasz  models, 
the  much  more  familiar  and  intuitively  simpler  Klein  models.  We  could 
try  to  derive  the  new  representation  theorem  from  the  old  one  by  showing 
in  a  purely  algebraic  way  that  every  Klein  model  is  isomorphic  with  some 
Hilbert-Szasz  model,  and  conversely.  We  prefer,  however,  to  obtain  this 
result  by  means  of  a  direct  procedure,  and  to  this  end  we  construct  a 
special  geometrical  algorithm,  which  will  be  called  the  hyperbolic  calculus 
of  free  segments.  This  algorithm  seems  to  present  some  geometrical 
interest  independent  of  any  metamathematical  applications  and  to  be 
conceptually  simpler  than  the  end-calculus  of  Hilbert. 

Consider a model $\mathfrak{M} = \langle A, B, D \rangle$ of $\mathcal{H}_n$ ($n \ge 2$) formed by an arbitrary
set A, a ternary relation B (the betweenness relation), and a quaternary
relation D (the equidistance relation) among elements (points) of A. By a
segment we understand any non-ordered couple pq of two distinct points
p, q in A. Two segments pq and rs are congruent (in symbols, $pq \equiv rs$) if
and only if D(pqrs). The set of all segments congruent to a given segment
pq is called the free segment determined by pq and is denoted by [pq]. Free
segments will be represented by variables X, Y, Z, ... and the set of all
free segments will be denoted by S. We wish to define a binary relation
$\le$ between elements of S and two binary operations $+$ and $\cdot$ on elements
of S in such a way that the rectangular coordinates introduced on the
basis of the resulting calculus function as the Beltrami coordinates and,
in fact, lead to the Klein model.

To obtain appropriate definitions let us assume for a while that $\mathfrak{M}$ is a
model, not only of $\mathcal{H}_2$, but of full two-dimensional hyperbolic geometry
with the non-elementary axiom of continuity (e.g. the ordinary Klein
model). As is well known, in such a model $\mathfrak{M}$ we can correlate with every
angle PQ a real number $\mu(PQ)$, $0 < \mu(PQ) < \pi$, called the measure of PQ.
The angle PQ is understood here as the non-ordered pair of half-lines P
and Q which are supposed to be non-collinear and to have a common
origin. Hence we can define in $\mathfrak{M}$ the Lobachevskian function $\Pi$, which
assigns a real number $\Pi(X)$, $0 < \Pi(X) < \pi/2$, to every free segment X.
In fact, given an oriented straight line L (Figure 1), a point p not on L,
the perpendicular projection q of p upon L, and the half-line P with
origin p and parallel to L, if Q is the half-line pq and X = [pq], then
$\Pi(X) = \mu(PQ)$. The Beltrami coordinates of an arbitrary point p of the
model $\mathfrak{M}$, if different from 0, are numbers of the form $\pm\cos \Pi(X)$,
$\pm\cos \Pi(Y)$, where X and Y are two free segments correlated with
point p. Using this fact we define the relation $\le$ and the operations $+$
and $\cdot$ for elements of S by the following conditions:

(I) $X \le Y$ if and only if $\cos \Pi(X) \le \cos \Pi(Y)$,

(II) $X + Y = Z$ if and only if $\cos \Pi(X) + \cos \Pi(Y) = \cos \Pi(Z)$,

(III) $X \cdot Y = Z$ if and only if $\cos \Pi(X) \cdot \cos \Pi(Y) = \cos \Pi(Z)$.

Fig. 1

We  shall  show  that  these  definitions  can  be  replaced  by  equivalent 
ones  formulated  entirely  in  terms  of  the  relations  B  and  D.  This  will 
make it possible to extend the definitions to an arbitrary model $\mathfrak{M}$.
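For the ordinary real model, conditions (I)-(III) can be illustrated numerically. The sketch below is not taken from the text: it assumes the classical analytic expression for the angle of parallelism in curvature -1, namely Π(d) = 2·arctan(e^(-d)) for a segment of length d, from which cos Π(d) = tanh d. The function names are hypothetical and a free segment is represented simply by its length.

```python
import math

def angle_of_parallelism(d):
    """Assumed classical formula: Pi(d) = 2*arctan(exp(-d)), d > 0, curvature -1."""
    return 2.0 * math.atan(math.exp(-d))

def value(d):
    """The number cos Pi(X) attached to a free segment X of length d."""
    return math.cos(angle_of_parallelism(d))

def length(v):
    """Inverse of value: the length of the free segment with cos Pi(X) = v, 0 < v < 1."""
    return math.atanh(v)

def add(d1, d2):
    """Length of X + Y under condition (II); None when the sum leaves (0, 1)."""
    v = value(d1) + value(d2)
    return length(v) if v < 1.0 else None

def mul(d1, d2):
    """Length of X . Y under condition (III); the product always stays in (0, 1)."""
    return length(value(d1) * value(d2))
```

Under this assumption the sum X + Y is a partial operation (it fails as soon as the two cosines add up to 1 or more), while the product is total and never longer than either factor, in line with the properties established later for the calculus of free segments.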

Relation $\le$. In view of (I) (since both functions, cos in the interval
$(0, \pi/2)$ and $\Pi$, are decreasing) $\le$ is the ordinary less than or equal to
relation; speaking precisely

(I') $X \le Y$ if and only if B(pqr), [pq] = X, and [pr] = Y, for some
p, q, r $\in$ A.

As usual the symbol $\ge$ will denote the relation converse to $\le$.

In  defining  the  operations  +  and  •,  and  in  deducing  their  fundamental 
properties  we  shall  use  the  notions  of  a  proper  or  improper  right  triangle 
and  of  a  proper  or  improper  right  quadrangle  (i.e.,  a  quadrangle  with 
three  right  angles).  For  our  purposes  it  is  convenient  to  introduce  these 
notions  in  the  following  way: 

Given three non-collinear points p, q, r, we say that the ordered triple
pqr is a (proper) right triangle if and only if $\angle pqr$ is a right angle.


Fig.  2 

Given two distinct points p and q, a half-line P with origin p, and a
half-line Q with origin q (Figure 2), we say that the ordered quadruple
PpqQ is an improper right triangle if and only if the half-lines qp and Q
form a right angle and $P \parallel Q$. It is clear that points p and q uniquely
determine half-lines P and Q.

Given four points p, q, r, s, no three of which are collinear, we say that
the ordered quadruple pqrs is a (proper) right quadrangle if and only if
$\angle spq$, $\angle pqr$, and $\angle qrs$ are three right angles. It is clear that there are
non-collinear points p, q, r, such that $\angle pqr$ is a right angle and for which
there is no point s such that pqrs is a right quadrangle.

Given three distinct points p, q, r, a half-line P with origin p, and a
half-line R with origin r (Figure 3), we say that the ordered quintuple
PpqrR is an improper right quadrangle if and only if half-lines P and pq,
qp and qr, rq and R form three right angles and $P \parallel R$. It is clear that
points p and q determine uniquely the half-line P, the point r, and
the half-line R.

Fig. 3

Before defining the operations $+$ and $\cdot$ in terms of the relations B and D
we first introduce four auxiliary operations on elements of S, in fact, two
binary operations, $\oplus$ and $\odot$, and two unary operations, R and C.


Operation $\oplus$. Given two free segments X and Y, consider the free
segment Z constructed in the following way: For some right quadrangle
pqrs let X = [pq], Y = [qr], and Z = [qs] (Figure 4). Clearly, the
segment Z thus defined does not always exist (since the right quadrangle pqrs
cannot always be constructed). If, however, Z exists, it is uniquely
determined by X and Y (independent of the choice of pqrs) and we then
put $X \oplus Y = Z$. To express the fact that $X \oplus Y$ does, or does not,
exist, we shall respectively write $X \oplus Y \in S$, $X \oplus Y \notin S$.

Operation $\odot$.³ Given two free segments X and Y, we consider a right
triangle pqr with X = [pq] and Y = [qr] (Figure 5), and we put $X \odot Y = [pr]$.
The operation $\odot$ thus defined is always performable, i.e., we have
$X \odot Y \in S$ for any X, Y $\in$ S.

It is worth noticing that both the operations $\oplus$ and $\odot$ make
sense in absolute geometry and that they coincide in Euclidean geometry.

Operation R. Given a free segment X, we consider an isosceles right
triangle pqr with X = [pr] (Figure 6), and we put RX = [pq] = [qr].
Clearly the operation R is always performable. RX can be referred to
as the square root of X.

Operation C. Given a free segment X, we consider an improper right
quadrangle PpqrR with X = [pq] (Figure 7), and we put CX = [qr].
Obviously the operation C is always performable. CX can be referred to
as the complement of X.

3 The operation $\odot$ was studied by Hjelmslev in [2].

Fig. 7

Clearly,  the  four  operations  just  defined  can  be  characterized  in  terms 
of  the  primitive  relations  B  and  D. 

Using some well known theorems of hyperbolic geometry we can
easily establish in $\mathfrak{M}$ the formulas

$\cos^2 \Pi(X) + \cos^2 \Pi(Y) = \cos^2 \Pi(X \oplus Y)$,
$\sin \Pi(X) \cdot \sin \Pi(Y) = \sin \Pi(X \odot Y)$,
$\sin^2 \Pi(RX) = \sin \Pi(X)$,
$\cos \Pi(CX) = \sin \Pi(X)$.

By definitions (II) and (III) these formulas imply at once the following
equivalences:

(II') $X + Y = Z$ if and only if $CRCX \oplus CRCY = CRCZ$,

(III') $X \cdot Y = Z$ if and only if $CX \odot CY = CZ$.

We now return to the original model $\mathfrak{M}$ of $\mathcal{H}_n$. In this model we
introduce the auxiliary operations $\oplus$, $\odot$, R, C (in the definitions of $\oplus$
and C it should be additionally mentioned that pqrs and PpqrR are
quadrangles on a plane) and assume equivalences (I'), (II'), (III') as
definitions of $\le$, $+$, $\cdot$.

We shall now establish the fundamental properties of the system
$\mathfrak{S} = \langle S, +, \cdot, \le \rangle$. A detailed discussion will be given only for the case
$n \ge 3$, thus using (when needed) three-dimensional constructions. Some
remarks concerning the case n = 2 will be given later.

In Lemma 2.1 we state some fundamental properties of the relation $\le$
and the auxiliary operations.

LEMMA 2.1. The system $\langle S, \oplus, \odot, R, C, \le \rangle$ satisfies the following
conditions:

(i) $\langle S, \le \rangle$ is a non-empty simply ordered system;

(ii) if $X, Y, X \oplus Y \in S$, then $X \oplus Y = Y \oplus X$;

(iii) if $X, Y, Z, X \oplus Y, (X \oplus Y) \oplus Z \in S$, then $Y \oplus Z \in S$ and
$(X \oplus Y) \oplus Z = X \oplus (Y \oplus Z)$;

(iv) if $X, Z \in S$, then $X \le Z$ if and only if $X = Z$ or else $X \oplus Y = Z$
for some $Y \in S$;

(v) if $X, Y \in S$, then $X \odot Y \in S$ and $X \odot Y = Y \odot X$;

(vi) if $X, Y, Z \in S$, then $(X \odot Y) \odot Z = X \odot (Y \odot Z)$;

(vii) if $X, Z \in S$, then $X \le Z$ if and only if $X = Z$ or else $X \odot Y = Z$
for some $Y \in S$;

(viii) if $X \in S$, then $RX \in S$ and $RX \odot RX = X$;

(ix) if $X, Y \in S$, then $X \le Y$ if and only if $RX \le RY$;

(x) if $X \in S$, then $CX \in S$ and $CCX = X$;

(xi) if $X, Y \in S$, then $X \le Y$ if and only if $CX \ge CY$;

(xii) if $X \in S$, then $X \oplus CX \notin S$;

(xiii) if $X, Z \in S$ and $Z < CX$, then $X \oplus Z \in S$.

PROOF.     All  the  postulates  (i)-(xiii)  with  the  exception  of  (iii)  and  (vi) 
result  immediately  from  the  definitions  of  the  notions  involved. 


Fig. 8

To derive Postulate (iii) (the associative law for $\oplus$) let X, Y, Z $\in$ S and let
p, q, r, s, t, u, w be seven distinct points (Figure 8) satisfying the following
conditions: ($\alpha$) [pq] = X, [pr] = Y, [ps] = Z, and the three segments
pq, pr, ps are pairwise perpendicular, ($\beta$) qprt and tpsu are two right
quadrangles, and ($\gamma$) w is the perpendicular projection of the point u upon
the straight line L which passes through the point r, is perpendicular to
the straight line pr, and lies in the plane prs. Then $[pt] = X \oplus Y$ and
$[pu] = (X \oplus Y) \oplus Z$. Furthermore qpwu proves to be a right quadrangle;
using this fact rpsw is shown to be a right quadrangle as well. Hence
$[pw] = Y \oplus Z$ and $[pu] = X \oplus (Y \oplus Z)$. Consequently
$(X \oplus Y) \oplus Z = X \oplus (Y \oplus Z)$, which was to be proved.

To derive Postulate (vi) (the associative law for $\odot$) let X, Y, Z $\in$ S and
let p, q, r, s be four distinct points (Figure 9) satisfying the conditions:
($\delta$) [pq] = X, [qr] = Y, [rs] = Z, and ($\varepsilon$) $\angle pqr$, $\angle prs$, $\angle qrs$ are
three right angles. Then $[pr] = X \odot Y$, $[qs] = Y \odot Z$, $[ps] = (X \odot Y) \odot Z$,
and $\angle pqs$ is a right angle. Hence $[ps] = X \odot (Y \odot Z)$ and consequently
$(X \odot Y) \odot Z = X \odot (Y \odot Z)$.⁴ The proof of Lemma 2.1 has thus been
completed.

Fig. 9


By the next two lemmas the discussion of the properties of the
operations $\cdot$ and $+$ reduces to that of the properties of the operations $\odot$ and $\oplus$
respectively.

LEMMA 2.2. The function C maps the system $\langle S, \cdot, \le \rangle$ isomorphically
onto the system $\langle S, \odot, \ge \rangle$.

This lemma follows directly from Lemma 2.1(x)(xi) and the definition
of $\cdot$.

LEMMA 2.3. The function CRC maps the system $\langle S, +, \cdot, \le \rangle$
isomorphically onto the system $\langle S, \oplus, \cdot, \le \rangle$.


4  The  argument  used  in  the  proof  of  (vi)  can  be  found  in  [2],  p.  5. 



PROOF. By Lemmas 2.2, 2.1(viii)(ix) and the definition of $+$, the
function CRC maps the system $\langle S, +, \le \rangle$ isomorphically onto the
system $\langle S, \oplus, \le \rangle$. To complete the proof it is sufficient to show that

(1) $X \cdot Y = Z$ if and only if $CRCX \cdot CRCY = CRCZ$.

From Lemma 2.1(v)-(viii) we easily derive the formula

$R(X \odot Y) = RX \odot RY$,

which, together with the definition of $\cdot$ and Lemma 2.1(x), gives us the
required equivalence (1).

The  next  lemma  provides  a  new  geometrical  construction  by  means  of 
which  the  operation  •  can  be  obtained.  This  lemma  will  eventually  lead 
to the distributive law for $\cdot$ under $+$; it also will be helpful in setting up
the foundations of the theory of proportion (see Lemma 2.11).

LEMMA  2.4.  Let  PpqQ  be  an  improper  right  triangle.  Furthermore,  let 
r  e  P  and  let  s  be  the  perpendicular  projection  of  r  upon  the  straight  line 
pq. Under these assumptions, if [pq] = X and [pr] = Y, then
$[ps] = X \cdot Y$ (Figure 10).


Fig.  10 

PROOF. We assume that [pq] = X and [pr] = Y. Consider four points
$t$, $p_1$, $r_1$, $s_1$ and three half-lines $T$, $R_1$, $S_1$ (Figure 11) which satisfy the
following conditions: ($\alpha$) $p_1 \ne p$, the straight line $pp_1$ is perpendicular
to the plane $pqr$, and $[pp_1] = CY$, and ($\beta$) $Ppp_1r_1R_1$, $QqptT$, and $Ttp_1s_1S_1$
are three improper right quadrangles. Then

$[p_1r_1] = Y$, $[pt] = CX$, $[tp_1] = CX \odot CY$, $[p_1s_1] = X \cdot Y$

and

$P \parallel R_1$, $Q \parallel T$, $T \parallel S_1$;

since $P \parallel Q$, the latter formulas imply $R_1 \parallel S_1$. Furthermore, it is easy to
check that ($\gamma$) the straight line $pp_1$ is perpendicular to the plane $p_1r_1s_1$
and hence the angles $spr$ and $s_1p_1r_1$ are congruent, and that ($\delta$) $\angle p_1s_1r_1$
is a right angle. Thus the triangles $prs$ and $p_1r_1s_1$ are congruent and,
specifically, segments $ps$ and $p_1s_1$ are congruent. In conclusion,
$[ps] = X \cdot Y$.

From Lemma 2.4 we readily derive

LEMMA 2.5. If $X, U \in S$, then $X \cdot U \oplus CX \cdot U = U$.

PROOF. Let $X, U \in S$. We pick an improper right quadrangle $Qqpq_1Q_1$
for which [pq] = X (Figure 12). Then $[pq_1] = CX$. On the half-line P
with origin p and parallel to the half-lines $Q$ and $Q_1$ we choose a point r in
such a way that [pr] = U. Let $s$ and $s_1$ be the perpendicular projections of r
upon the straight lines $pq$ and $pq_1$. Then $sps_1r$ is a right quadrangle, and
by Lemma 2.4, we have $[ps] = X \cdot U$ and $[ps_1] = CX \cdot U$. Hence
$X \cdot U \oplus CX \cdot U = U$, which completes the proof.

As an immediate consequence of Lemmas 2.1(i)-(vii), 2.2, 2.3, 2.1(xii)-
(xiii), and 2.5 we obtain the fundamental theorem on the calculus of free
segments.

THEOREM 2.6. For every model $\mathfrak{M}$ of $\mathcal{H}_n$ ($n \ge 3$), the system
$\mathfrak{S} = \langle S, +, \cdot, \le \rangle$ satisfies the following conditions:

(i) $\langle S, \le \rangle$ is a non-empty simply ordered system;

(ii) if $X, Y, X + Y \in S$, then $X + Y = Y + X$;

(iii) if $X, Y, Z, X + Y, (X + Y) + Z \in S$, then $Y + Z \in S$ and
$(X + Y) + Z = X + (Y + Z)$;

(iv) if $X, Z \in S$, then $X \le Z$ if and only if $X = Z$ or else $X + Y = Z$
for some $Y \in S$;

(v) if $X, Y \in S$, then $X \cdot Y \in S$ and $X \cdot Y = Y \cdot X$;

(vi) if $X, Y, Z \in S$, then $(X \cdot Y) \cdot Z = X \cdot (Y \cdot Z)$;

(vii) if $X, Z \in S$, then $Z \le X$ if and only if $Z = X$ or else $X \cdot Y = Z$ for
some $Y \in S$;

(viii) if $X \in S$, then there is a $Y \in S$ such that: ($\alpha$) $X + Y \notin S$, ($\beta$) if
$Z \in S$ and $Z < Y$, then $X + Z \in S$, and ($\gamma$) $X \cdot U + Y \cdot U = U$ for
every $U \in S$.
NOTE 2.7. Theorem 2.6 can be extended to the case n = 2.

In fact, Lemmas 2.1(iii), 2.1(vi), and 2.4 are the only ones in proofs of
which three-dimensional constructions are involved. These constructions
should now be replaced by two-dimensional ones. Unfortunately, a direct
two-dimensional proof of Lemma 2.4 is still lacking.⁵ We know, however,
an indirect two-dimensional proof of this lemma; it is based upon one of
the fundamental results of Section 1, namely the completeness of $\mathcal{H}_2$
(see Theorem 1.2). On the other hand, we know two direct two-dimensional
arguments which lead from Lemma 2.4 to Lemmas 2.1(iii) and 2.1(vi),
respectively. As opposed to the three-dimensional proofs of Lemmas 2.1(iii)
and 2.1(vi) which have a quite elementary character, these two-dimensional
arguments are rather involved and refer to deep properties of the
plane. Lack of space prevents us from outlining these constructions.

As a consequence of Theorem 2.6, Note 2.7 and the elementary
continuity axioms, we obtain by a purely algebraic argument

THEOREM 2.8. For every model $\mathfrak{M}$ of $\mathcal{H}_n$ ($n \ge 2$), the system
$\mathfrak{S} = \langle S, +, \cdot, \le \rangle$ can be imbedded in a real closed field $\mathfrak{F} = \langle F, +, \cdot, \le \rangle$
in such a way that S consists of all those elements $X \in F$ for which $0 < X < 1$
(where 0 is the zero element and 1 is the unit element of the field $\mathfrak{F}$). In fact, $\mathfrak{F}$
is up to isomorphism uniquely determined by $\mathfrak{S}$.

The proof of this theorem is easy, though lengthy and laborious.
Postulate (viii) (of Theorem 2.6) plays an essential role in showing that S
is the set of all elements of F between 0 and 1. While the last part of (viii)
is a particular case of the distributive law, (viii) plays also an essential
role in the derivation of this law in its general form.

5 See Footnote 6 on page 51.

From now on we assume that the field $\mathfrak{F}$ involved in Theorem 2.8 has
been  fixed  and  we  apply  to  it  the  familiar  field-theoretical  notation.  In 
particular,  the  operations  +  and  •  are  now  understood  to  be  performable 
on  arbitrary  elements  of  the  field  and  not  only  on  free  segments. 

Theorem  2.8  essentially  completes  our  outline  of  the  calculus  of  free 
segments.  We  shall  need  however  a  few  further  lemmas  of  a  related 
character before we turn, in Theorems 2.15 and 2.16, to the
metamathematical discussion of systems $\mathcal{H}_n$.

LEMMA 2.9. If $X, Y, Z \in S$, then

(i) $X \odot Y = Z$ if and only if $CX \cdot CY = CZ$;

(ii) $X \oplus Y = Z$ if and only if $X^2 + Y^2 = Z^2$;

(iii) $CX = \sqrt{1 - X^2}$.

PROOF. The equivalence (i) is an immediate consequence of the
definition of $\cdot$ and Lemma 2.1(x).

By (i) and Lemma 2.1(viii)-(x) we get

(2) $CRCX \cdot CRCX = X$.

From  the  definition  of  +  and  formulas  (1)  (on  page  41)  and  (2)   we 
easily  derive  the  equivalence  (ii). 

The  formula  (iii)  follows  at  once  from  (ii)  and  Lemma  2.5. 
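Lemma 2.9 gives the auxiliary operations a purely arithmetical form once S is identified with the open interval (0, 1) of the field. The sketch below is an illustration only: it uses floating-point reals in place of a real closed field, the function names `oplus`, `odot`, `CRC` are hypothetical, and the closed form chosen for R is derived here from Lemma 2.1(viii) together with Lemma 2.9(i) rather than quoted from the text.

```python
import math

def C(x):
    """Complement, Lemma 2.9(iii): CX = sqrt(1 - X^2)."""
    return math.sqrt(1.0 - x * x)

def oplus(x, y):
    """Lemma 2.9(ii): Z^2 = X^2 + Y^2; None when the result falls outside (0, 1)."""
    z2 = x * x + y * y
    return math.sqrt(z2) if z2 < 1.0 else None

def odot(x, y):
    """Lemma 2.9(i): C(X (.) Y) = CX * CY, hence X (.) Y = C(CX * CY)."""
    return C(C(x) * C(y))

def R(x):
    """From RX (.) RX = X and Lemma 2.9(i): (C(RX))^2 = CX, so RX = C(sqrt(CX))."""
    return C(math.sqrt(C(x)))

def CRC(x):
    """The composite C R C, which by formula (2) acts as the field square root."""
    return C(R(C(x)))
```

In this reading CRC is the ordinary square root, so equivalence (II') says that X + Y = Z exactly when the square roots of X, Y, Z satisfy the Pythagorean relation, and (III') says that the product is the complement of the hypotenuse built on the complements.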

LEMMA 2.10. Let pqrs be a right quadrangle and let X = [pq], Y = [qr],
Z = [rs]. Then we have $X = CY \cdot Z$.

PROOF. Let U = [qs] (Figure 13). Then, in agreement with the
definitions of $\oplus$ and $\odot$, and by Lemma 2.9, we have

$U^2 = X^2 + Y^2$ and $1 - U^2 = (1 - Y^2) \cdot (1 - Z^2)$.

Comparing these two formulas we get $X^2 = (1 - Y^2) \cdot Z^2$, and applying
Lemma 2.9(iii) we obtain the conclusion.


LEMMA 2.11. For i = 1, 2, let $p_ir_is_i$ be right triangles and let $Y_i = [p_ir_i]$
and $Z_i = [p_is_i]$. Under these assumptions, if the angles at the vertices $p_1$
and $p_2$ are congruent, then $Y_1 \cdot Z_2 = Y_2 \cdot Z_1$.

PROOF. Assume that $\angle r_1p_1s_1 \equiv \angle r_2p_2s_2$ and let $P_i$ be the half-line $p_ir_i$
(Figure 14). The triangle $p_ir_is_i$ determines uniquely a point $q_i$ on the
half-line $p_is_i$ and a half-line $Q_i$ with origin $q_i$ such that $P_ip_iq_iQ_i$ is an
improper right triangle. Then $[p_1q_1] = [p_2q_2]$. Putting $[p_1q_1] = [p_2q_2] = X$
and applying Lemma 2.4 we get the formula

$Z_i = X \cdot Y_i$ for i = 1, 2,

which completes the proof.

Fig. 14

LEMMA 2.12. Let pqr be a right triangle, let s be the perpendicular
projection of q upon the straight line pr, and let X = [ps], Y = [rs], Z = [pr],
U = [pq], and V = [qr]. Then we have:

(i) $X \cdot Z = U \cdot U$,

(ii) $X \odot Z = Y \odot U \odot U$

(Figure 15).

PROOF. Formula (i) follows immediately from Lemma 2.11.
Let W = [qs]. Then

$X \odot W = U$, $Y \odot W = V$, $U \odot V = Z$,

and consequently

$X \odot Z = X \odot U \odot V = X \odot U \odot Y \odot W = Y \odot U \odot U$;

thus we arrive at formula (ii).


LEMMA 2.13. Given three distinct points p, s, r for which B(psr), let
X = [ps], Y = [sr], and Z = [pr]. We then have

(i) $CY = \dfrac{CX \cdot CZ}{1 - X \cdot Z}$

and

(ii) $CZ = \dfrac{CX \cdot CY}{1 + X \cdot Y}$.


PROOF. To derive (i) we take a point q in such a way that pqr is a
right triangle and s is the perpendicular projection of q upon the straight
line pr (Figure 15). Let U = [pq]. Then by Lemma 2.12(i), we have
$X \cdot Z = U^2$, i.e.,

(3) $1 - X \cdot Z = (CU)^2$,

and, by Lemma 2.12(ii), we get $X \odot Z = Y \odot U \odot U$, which by Lemma
2.9(i) implies

(4) $CX \cdot CZ = CY \cdot (CU)^2$.

From (3) and (4) we obtain at once the desired formula.

From (i) and the inequality X < Z (which obviously follows from the
hypothesis) we derive (ii) by means of a simple algebraic transformation.
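Combining (3) and (4) gives $CX \cdot CZ = CY \cdot (1 - X \cdot Z)$, and solving for Z yields an addition law for collinear segments. The following numerical check is a sketch under a stated assumption that is not part of the proof: in the real model a free segment of length d is taken to be represented by the number tanh d, so that for B(psr) with lengths a and b the three values are tanh a, tanh b and tanh(a + b). The function name is hypothetical.

```python
import math

def C(x):
    """Complement in the field, Lemma 2.9(iii): CX = sqrt(1 - X^2)."""
    return math.sqrt(1.0 - x * x)

def check_addition_law(a, b):
    """Assumed setting: s lies between p and r, |ps| = a, |sr| = b, |pr| = a + b,
    and a segment of length d has the value tanh(d)."""
    X, Y, Z = math.tanh(a), math.tanh(b), math.tanh(a + b)
    # Consequence of (3) and (4): CX * CZ = CY * (1 - X * Z).
    from_3_and_4 = math.isclose(C(X) * C(Z), C(Y) * (1.0 - X * Z))
    # The same relation solved for Z: Z = (X + Y) / (1 + X * Y).
    solved_for_z = math.isclose(Z, (X + Y) / (1.0 + X * Y))
    return from_3_and_4, solved_for_z
```

Both relations are instances of the addition theorems for hyperbolic functions, which is why the inequality X < Z is needed to choose the sign when passing from one form to the other.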

LEMMA 2.14. Given four distinct points p, q, r, s, we have

(i) B(pqr) if and only if $C[pr] = \dfrac{C[pq] \cdot C[qr]}{1 + [pq] \cdot [qr]}$,

(ii) D(pqrs) if and only if $C[pq] = C[rs]$.

PROOF. Formula (i) follows from Lemma 2.13, in one direction directly,
in the other direction by a simple argument. Formula (ii) is obvious.

The  metamathematical  discussion  begins  with  the  representation 
theorem. 

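Let $\mathfrak{F} = \langle F, +, \cdot, \le \rangle$ be an arbitrary ordered field. By the
n-dimensional Klein space $\mathfrak{K}_n(\mathfrak{F})$ over the field $\mathfrak{F}$ we understand the system
$\langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ constructed in the following way: $A_{\mathfrak{F}}$ is the set of all
ordered n-tuples $x = \langle x_1, x_2, \ldots, x_n \rangle$ in $F \times F \times \ldots \times F$ (n times) for
which

$\sum_{i=1}^{n} x_i^2 < 1$.

For any ordered n-tuples $x = \langle x_1, x_2, \ldots, x_n \rangle$ and $y = \langle y_1, y_2, \ldots, y_n \rangle$ in
$A_{\mathfrak{F}}$ let

$x \cdot y = \sum_{i=1}^{n} x_i \cdot y_i$

(thus $-1 < x \cdot y < 1$) and

$\Psi(x, y) = \dfrac{(1 - x \cdot x) \cdot (1 - y \cdot y)}{(1 - x \cdot y)^2}$.

We always have $\Psi(x, y) \le 1$. The betweenness relation $B_{\mathfrak{F}}$ among any
three n-tuples x, y, z in $A_{\mathfrak{F}}$ is characterized by the formula

$\Psi(x, z) = \dfrac{\Psi(x, y) \cdot \Psi(y, z)}{\left(1 + \sqrt{1 - \Psi(x, y)} \cdot \sqrt{1 - \Psi(y, z)}\right)^2}$.

The equidistance relation $D_{\mathfrak{F}}$ among any four n-tuples x, y, z, u in $A_{\mathfrak{F}}$ is
characterized by the formula

$\Psi(x, y) = \Psi(z, u)$.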
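The Klein space can be explored numerically over the real numbers. The sketch below is only an illustration with hypothetical function names: it uses floating-point reals with a tolerance in place of exact arithmetic in a real closed field, and for betweenness it tests the relation Ψ(x, z)·(1 + √(1 − Ψ(x, y))·√(1 − Ψ(y, z)))² = Ψ(x, y)·Ψ(y, z), which is the addition law of Lemma 2.13 rewritten in terms of Ψ and is assumed here to be the intended characterization.

```python
import math

def dot(x, y):
    """The product x . y = sum of x_i * y_i."""
    return sum(xi * yi for xi, yi in zip(x, y))

def in_A(x):
    """Points of the Klein space: n-tuples with x . x < 1."""
    return dot(x, x) < 1

def psi(x, y):
    """Psi(x, y) = (1 - x.x)(1 - y.y) / (1 - x.y)^2."""
    return (1 - dot(x, x)) * (1 - dot(y, y)) / (1 - dot(x, y)) ** 2

def seg(x, y):
    """sqrt(1 - Psi(x, y)); clamped at 0 to absorb rounding when x = y."""
    return math.sqrt(max(0.0, 1.0 - psi(x, y)))

def between(x, y, z, tol=1e-9):
    """Assumed betweenness test: y lies between x and z (floating-point check)."""
    lhs = psi(x, z) * (1 + seg(x, y) * seg(y, z)) ** 2
    return math.isclose(lhs, psi(x, y) * psi(y, z), rel_tol=tol)

def equidistant(x, y, z, u, tol=1e-9):
    """Equidistance: Psi(x, y) = Psi(z, u), up to the tolerance."""
    return math.isclose(psi(x, y), psi(z, u), rel_tol=tol)

# Sample points of the two-dimensional Klein space over the reals.
x0, y0, z0, w0 = (0.0, 0.0), (0.3, 0.0), (0.6, 0.0), (0.0, 0.3)
```

For the sample points, y0 lies between x0 and z0 on a diameter of the unit disc, while w0 is off that line but at the same distance from x0 as y0.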

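THEOREM 2.15. (REPRESENTATION THEOREM). A system $\mathfrak{M} = \langle A, B, D \rangle$
is a model of $\mathcal{H}_n$ ($n \ge 2$) if and only if it is isomorphic with the Klein
space $\mathfrak{K}_n(\mathfrak{F}) = \langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ over some real closed field $\mathfrak{F}$.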


PROOF. It is well known that the Klein space $\mathfrak{K}_n(\mathfrak{R})$ over the ordered
field $\mathfrak{R}$ of real numbers is a model for $\mathcal{H}_n$. Hence, by the result of Tarski
used in the proof of Theorem 1.1, the same applies to all the spaces
$\mathfrak{K}_n(\mathfrak{F})$ where $\mathfrak{F}$ is a real closed field, as well as to all isomorphic systems.

To prove the theorem in the opposite direction consider an arbitrary
model $\mathfrak{M} = \langle A, B, D \rangle$ of $\mathcal{H}_n$ and the correlated system $\mathfrak{S} = \langle S, +, \cdot, \le \rangle$.
By Theorem 2.8, the system $\mathfrak{S}$ can be imbedded in a real closed field
$\mathfrak{F} = \langle F, +, \cdot, \le \rangle$, and we can construct the corresponding Klein model
$\mathfrak{K}_n(\mathfrak{F}) = \langle A_{\mathfrak{F}}, B_{\mathfrak{F}}, D_{\mathfrak{F}} \rangle$ over the field $\mathfrak{F}$. We introduce in the model $\mathfrak{M}$ a
rectangular coordinate system (each of the n coordinates of a point p
being of the form $\pm U$ where $U \in S$). It is easy to check that, by
correlating with every point p of A the ordered n-tuple
$X^p = \langle X^p_1, X^p_2, \ldots, X^p_n \rangle$ of its coordinates, we establish a 1-1
correspondence between the points of A and the points of $A_{\mathfrak{F}}$. (See the
definitions of $A_{\mathfrak{F}}$ and $\oplus$, Theorem 2.8 and Lemma 2.9(ii).) It remains to be
shown that this correspondence establishes an isomorphism between $\mathfrak{M}$ and
$\mathfrak{K}_n(\mathfrak{F})$. This reduces to showing that the relations B and D among points of A
can be characterized in terms of the coordinates of these points in exactly the
same way in which the relations $B_{\mathfrak{F}}$ and $D_{\mathfrak{F}}$ among points of $A_{\mathfrak{F}}$ have
been defined in the Klein model $\mathfrak{K}_n(\mathfrak{F})$.

Consider two distinct points p and q in A and the correlated n-tuples
of coordinates $X^p$ and $X^q$. We first express the free segment [pq] in terms
of $X^p$ and $X^q$. An easy but lengthy calculation, based exclusively upon
Lemmas 2.9(i), 2.10, and 2.13, leads to

(5) $(C[pq])^2 = \Psi(X^p, X^q)$,

where $\Psi$ is the function used in describing the Klein space. (The argument
is analogous to that used in the Euclidean case, with the difference that
rectangles are replaced by right quadrangles.) From (5), Lemma 2.9(iii),
and Lemma 2.14(i) we conclude at once that the condition

$\Psi(X^p, X^r) = \dfrac{\Psi(X^p, X^q) \cdot \Psi(X^q, X^r)}{\left(1 + \sqrt{1 - \Psi(X^p, X^q)} \cdot \sqrt{1 - \Psi(X^q, X^r)}\right)^2}$

is necessary and sufficient for points p, q, r to satisfy the formula B(pqr).
Similarly, from (5) and Lemma 2.14(ii) we conclude that the condition

$\Psi(X^p, X^q) = \Psi(X^r, X^s)$

is necessary and sufficient for points p, q, r, s to satisfy the formula
D(pqrs). Thus the proof is completed.

Using Theorem 2.15 instead of 1.1 we obtain of course a new proof of
Theorem 1.2 and, actually, we can extend this result to arbitrary
dimension n:

THEOREM 2.16. The theory $\mathcal{H}_n$ ($n \ge 2$) is complete and decidable but
not finitely axiomatizable.

3. Applications to Related Geometrical Systems. Using the main results
stated in [5] for Euclidean geometry and in this paper for hyperbolic
geometry we shall now establish fundamental metamathematical
properties of elementary absolute geometry. The discussion in [5] has been
restricted to the two-dimensional case only for simplicity of formulation
and the results established there clearly extend to elementary
n-dimensional Euclidean geometry $\mathcal{E}_n$ for any $n \ge 2$.

Let the formalized system $\mathcal{A}_n$ of n-dimensional absolute geometry be a
theory which has the same symbolism as $\mathcal{E}_n$ and $\mathcal{H}_n$ and the axiom
system of which is obtained by omitting Euclid's axiom, A8, in the axiom
system of $\mathcal{E}_n$ (or the negation of A8 in the axiom system of $\mathcal{H}_n$). Thus a
sentence is valid in $\mathcal{A}_n$ if and only if it is valid in both $\mathcal{E}_n$ and $\mathcal{H}_n$. As a
simple consequence of Theorem 1 in [5] and Theorem 2.15 in the present
paper we obtain

THEOREM 3.1. $\mathfrak{M}$ is a model of $\mathcal{A}_n$ ($n \ge 2$) if and only if it is isomorphic
either with the Cartesian space $\mathfrak{C}_n(\mathfrak{F})$ or with the Klein space $\mathfrak{K}_n(\mathfrak{F})$ over
some real closed field $\mathfrak{F}$.

Theorem 3.1 contains a description of all models of $\mathcal{A}_n$ which is,
however, not uniform in its character; the class of models proves to
consist of two widely different subclasses. It would be interesting to
obtain a more homogeneous characterization of this class.

Theorems 2, 3, and 4 in [5] and Theorem 2.16 in the present paper
imply the following Theorems 3.2-3.4 as direct corollaries.

THEOREM 3.2. The theory $\mathcal{A}_n$ ($n \ge 2$) has just two complete and
consistent extensions, in fact, $\mathcal{E}_n$ and $\mathcal{H}_n$.



A consequence of Theorem 3.2 is that Euclid's axiom can be
equivalently replaced in the axiom system of $\mathcal{E}_n$ by any sentence whatsoever
which is valid in $\mathcal{E}_n$ but not in $\mathcal{H}_n$; the same of course applies to the
negation of Euclid's axiom in the axiom system of $\mathcal{H}_n$.

THEOREM 3.3. The theory $\mathcal{A}_n$ ($n \ge 2$) is decidable.

This theorem is an improvement of Tarski's decision theorem for $\mathcal{E}_n$.

THEOREM 3.4. The theory $\mathcal{A}_n$ ($n \ge 2$) is not finitely axiomatizable.

In conclusion we wish to make some remarks concerning the system
$\mathscr{H}'_n$ of non-elementary n-dimensional hyperbolic geometry. The main
difference between the symbolisms of $\mathscr{H}_n$ and $\mathscr{H}'_n$
consists primarily in the fact that all the variables occurring in the former
range over points, while the latter contains also variables ranging over
arbitrary point sets. (The question whether $\mathscr{H}'_n$ contains in
addition variables of higher orders ranging over families of sets, etc. is
irrelevant for the subsequent remarks.) The axiom system of $\mathscr{H}'_n$
is obtained from that of $\mathscr{H}_n$ by replacing the infinite collection
of elementary continuity axioms by one non-elementary axiom (see [5], p. 18).
In every model $\mathfrak{M}$ of $\mathscr{H}'_n$ the ordered field
$\mathfrak{F}$ in which the system $\mathfrak{S}$ can be imbedded (see
Theorem 2.8) proves to be continuously ordered. Since a continuously ordered
field $\mathfrak{F}$ is isomorphic with the field $\mathfrak{R}$ of real
numbers, the correlated Klein space $\mathfrak{K}_n(\mathfrak{F})$ is
isomorphic with the Klein space $\mathfrak{K}_n(\mathfrak{R})$. Thus, by
Theorem 2.15, we conclude that every model $\mathfrak{M}$ of $\mathscr{H}'_n$
is isomorphic with the Klein model $\mathfrak{K}_n(\mathfrak{R})$. In this way
we arrive at

THEOREM 2.18. The theory $\mathscr{H}'_n$ is categorical.

This  result  is  well  known  but  all  other  proofs  which  are  known  to  the 
author  are  based  upon  an  analytic  formula  for  H(X)  (see  page  34)  and 
hence  upon  some  properties  of  exponential  and  trigonometric  functions.  6 


6  While  the  paper  was  in  press  the  author  noticed  that  a  direct  two-dimensional 
proof  of  Lemma  2.4  (cf.  Note  2.7  on  p.  44)  results  at  once  from  a  theorem  due  to 
Liebmann  in  [6],  p.  191. 

Moreover, the author succeeded in constructing in $\mathscr{A}_n$ ($n \geq 2$)
an absolute calculus of segments. This calculus leads to the representation
theorems for both Euclidean and Bolyai-Lobachevskian geometries.



Bibliography 

[1] HILBERT, D. Grundlagen der Geometrie. 8th ed., Stuttgart 1956.

[2] HJELMSLEV, J. Beiträge zur Nicht-Eudoxischen Geometrie I-II. Det Kgl.
Danske Videnskabernes Selskab, Matematisk-Fysiske Meddelelser, vol. 21
(1944), Nr. 5.

[3] SZÁSZ, P. Direct introduction of Weierstrass homogeneous coordinates in
the hyperbolic plane, on the basis of the endcalculus of Hilbert. This
volume, pp. 97-113.

[4] TARSKI, A. A decision method for elementary algebra and geometry. 2nd
ed., Berkeley and Los Angeles 1951.

[5] TARSKI, A. What is elementary geometry? This volume, pp. 16-29.

[6] LIEBMANN, H. Elementargeometrischer Beweis der Parallelenkonstruktion
und neue Begründung der trigonometrischen Formeln der hyperbolischen
Geometrie. Mathematische Annalen, vol. 61 (1905), pp. 185-199.


Symposium  on  the  Axiomatic  Method 


DIMENSION  IN  ELEMENTARY  EUCLIDEAN  GEOMETRY  1 

DANA  SCOTT 

Princeton   University,  Princeton,  New  Jersey,   U.S.A. 

Introduction.  It  has  been  well  over  one  hundred  years  since  higher 
dimensional  geometry  made  its  appearance  in  mathematics  and  at  least 
fifty  years  since  the  terminology  of  infinite  dimensional  spaces  came  into 
general  use.  No  one  can  deny  the  enrichment  of  the  subject  brought 
about  by  the  introduction  of  these  notions,  but  even  though  the  infinite 
dimensional  spaces  would  seem  to  be  a  direct  generalization  of  the  finite 
dimensional  spaces,  it  is  clear  that  their  importance  in  mathematics  really 
lies  in  a  different  direction.  In  finite  dimensions  we  are  concerned  with 
ever  more  complicated  configurations  of  points,  lines,  planes,  spheres,  or 
other  algebraic  varieties,  and  in  this  study  a  knowledge  of  facts  in  higher 
dimensions  often  leads  to  a  better  understanding  of  the  lower  dimensions. 
Of  course,  all  such  configurations  are  possible  in  an  infinite  dimensional 
space,  but  in  the  study  of  any  one  particular  problem  so  little  of  the  space 
is  used  that  one  might  as  well  work  in  only  a  finite  number  of  dimensions. 
Thus,  the  question  arises  whether  there  is  really  anything  new  in  infinite 
dimensional  geometry.  The  applications  of  infinite  dimensional  geometry 
to  analysis  and  the  study  of  function  spaces  are  something  new  and 
beyond  the  finite  dimensional  theory,  but  this  is  not  what  is  meant.  From 
the  standpoint  of  pure  geometry,  is  there  anything  new?  In  particular, 
are  there  different  kinds  of  infinite  dimensional  spaces  ?  Of  course,  anyone 
can  think  of  two  distinct  Hilbert  spaces,  for  example,  but  is  there  any 
geometrical  property  that  distinguishes  them?  Cardinality  is  a  property 
that  will  often  distinguish  between  two  infinite  dimensional  spaces; 
however,  the  cardinal  number  of  a  set  is  not  really  associated  with  the 
internal  structure  of  the  space  in  isolation  but  only  becomes  meaningful 
in  comparisons  with  other  sets.  Thus,  in  the  study  of  geometrical  proper- 
ties of  spaces  we  wish  to  restrict  attention  to  those  constructions  that  can 
be  carried  out  within  the  space  itself  making  use  of  only  the  given  geo- 
metrical notions.  This  point  of  view  seems  even  to  throw  doubt  on 

1  The  results  of  this  paper  represent  a  portion  of  a  thesis  submitted  to  the 
Faculty  of  Princeton  University  in  partial  fulfillment  of  the  requirements  for  the 
degree  of  Doctor  of  Philosophy. 




topological  questions.  The  most  useful  facts  of  point-set  topology  nearly 
always  rest  on  operations  performed  on  arbitrary  subsets  of  the  space,  and 
since  the  time  of  Cantor  we  have  realized  how  vastly  complicated  these 
subsets  may  become.  Indeed,  point-set  topology  with  its  heavy  use  of 
infinite  combinations  and  infinite  repetitions  of  operations,  though 
derived  from  geometrical  intuition,  is  a  totally  new  discipline  that  has 
moved  far  from  the  special  world  of  Euclidean  geometry.  The  same  may 
be  said  for  other  questions  of  the  analysis  of  Hilbert  spaces  such  as 
completeness,  existence  of  orthonormal  bases,  and  the  like.  Thus,  we  may 
be  led  to  the  conclusion  that  the  usual  geometrical  notions  do  not 
involve  infinite  sets  or  infinite  sequences  of  points,  and  this  will  be  the 
convention adopted in the present paper. If the reader does not entirely
agree  with  this  point  of  view,  at  least  it  is  hoped  that  he  will  agree  that 
elementary  geometrical  notions  do  not  involve  infinite  sets  and  that  he 
admits  that  even  the  infinite  dimensional  spaces  contain  the  material  for 
many  elementary  constructions,  so  that  there  is  a  meaningful  question 
whether  infinite  dimensional  spaces  can  be  distinguished  by  their  ele- 
mentary properties. 

First  of  all  it  must  be  said  what  Euclidean  spaces,  finite  or  infinite 
dimensional,  actually  are.  The  definition  chosen  in  Section  1  is  the 
standard  one  making  use  of  vector  spaces.  Geometrical  properties 
of  spaces  must  be  formulated  in  terms  of  geometrically  meaningful 
notions.  In  the  case  of  elementary  properties  there  is  no  loss  of  gener- 
ality in  considering  only  finitary  relations  between  points,  and  in  this 
context  the  term  geometrically  meaningful  relation  or  simply  geometri- 
cal relation  is  given  a  precise  definition.  Finally  elementary  geometrical 
properties  are  identified  with  those  properties  of  a  space  expressible  in 
sentences  of  the  first-order  predicate  logic  in  terms  of  the  geometrical 
relations  over  the  space.  Before  giving  the  specifically  geometric  results, 
a  general  theorem  in  the  theory  of  models  of  the  first-order  logic  is 
presented in Section 2. The general result is then applied in a
straightforward way to geometry in Section 3, and it is shown that there are no
elementary  geometrical  properties  distinguishing  any  two  infinite  dimensional 
Euclidean  spaces.  In  particular,  for  any  given  formal  property,  a  very 
simple  method  is  given  for  calculating  a  finite  dimension,  m  say,  such 
that  the  property  is  true  in  spaces  of  dimension  m  if  and  only  if  it  is  true 
in  all  higher  dimensions  including  all  the  infinite  dimensions.  The  con- 
sequences of  this  state  of  affairs  for  a  certain  formal  theory  of  geometry 
are  indicated  in  Section  4. 



The  author  would  like  to  thank  Professor  Tarski,  who  originally 
proposed  the  problem  and  who  made  many  helpful  comments  on  the 
formulation  of  the  results. 

1.  Euclidean  Spaces  and  Geometrical  Relations.  Before  discussing  any 
formal  theory,  it  is  necessary  to  determine  the  standard  domains  of 
discourse  to  which  the  theory  will  be  applied.  As  regards  geometry,  if  we 
were  concerned  only  with  finite  dimensions,  we  could  think  simply  of  the 
ordinary n-dimensional cartesian spaces whose points are n-tuples of real
numbers.  But  these  are  not  sufficient  for  our  purposes.  In  any  case,  there 
is  no  need  to  think  of  a  particular  coordinate  system  as  in  the  cartesian 
spaces,  because  a  distinguished  coordinate  system  is  not  a  purely  geo- 
metrical notion.  A  definition  by  vector  space  methods  solves  the  problem 
and  eliminates  any  distinguished  set  of  coordinates.  In  the  first  place,  a 
(standard)  Euclidean  space  will  be  a  vector  space  over  the  field  of  real 
numbers  having  any  finite  or  infinite  linear  dimension.  In  addition,  to 
give  the  essential  Euclidean  character  to  the  space,  a  notion  sufficient  for 
questions  of  distance  and  perpendicularity  has  to  be  supplied.  A  positive 
definite inner product on the space will do just that. To sum up, a Euclidean
space is a 4-tuple $\langle V, +, \cdot, \bullet \rangle$, where $V$ is a set
of elements called the points of the space; $\langle V, + \rangle$ is an
abelian group; $\cdot$ is an operation from the cartesian product of the real
numbers with $V$ to the set $V$ satisfying the following properties for all
reals $\alpha$, $\beta$ and all $x, y \in V$:

(i) $1 \cdot x = x$;
(ii) $\alpha \cdot (\beta \cdot x) = (\alpha\beta) \cdot x$;
(iii) $(\alpha + \beta) \cdot x = (\alpha \cdot x) + (\beta \cdot x)$;
(iv) $\alpha \cdot (x + y) = (\alpha \cdot x) + (\alpha \cdot y)$;

and finally $\bullet$ is an operation from pairs of elements of $V$ to real
numbers such that for all reals $\alpha$, $\beta$ and all $x, y, z \in V$:

(v) If $x \neq 0$, then $x \bullet x > 0$;
(vi) $x \bullet y = y \bullet x$;
(vii) $((\alpha \cdot x) + (\beta \cdot y)) \bullet z = \alpha (x \bullet z) + \beta (y \bullet z)$;

where the symbol 0 denotes in the hypothesis of (v) the zero element of
the group.

German capital letters $\mathfrak{V}$ and $\mathfrak{W}$ will be used to
denote Euclidean spaces, and the corresponding sets of points will be denoted
by Roman capitals $V$ and $W$.

In a Euclidean space the distance between points $x$ and $y$, in symbols
$\|x - y\|$, can be introduced by definition:

$$\|x - y\| = \sqrt{(x - y) \bullet (x - y)},$$

where $(x - y)$ on the right hand side means $(x + ((-1) \cdot y))$.
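
As a concrete illustration of this definition, the following Python sketch
takes the points to be tuples of real numbers, with the ordinary dot product
playing the role of the inner product. This coordinate model is an assumption
made only for illustration (the definition above is deliberately
coordinate-free), and the helper names add, scale, inner, and distance are
hypothetical.

```python
import math

def add(x, y):
    """Group operation + on points (componentwise sum)."""
    return tuple(a + b for a, b in zip(x, y))

def scale(alpha, x):
    """Scalar multiplication of the point x by the real number alpha."""
    return tuple(alpha * a for a in x)

def inner(x, y):
    """Positive definite inner product (here: the standard dot product)."""
    return sum(a * b for a, b in zip(x, y))

def distance(x, y):
    """Distance defined from the inner product, with x - y = x + ((-1) . y)."""
    diff = add(x, scale(-1, y))
    return math.sqrt(inner(diff, diff))

x = (1.0, 2.0, 2.0)
y = (1.0, -1.0, -2.0)
print(distance(x, y))  # prints 5.0
```

Only the three operations of the 4-tuple are used, so any other positive
definite inner product could be substituted for inner without changing
distance.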

The  above  treatment  of  Euclidean  spaces,  though  it  does  not  involve 
a  choice  of  a  particular  coordinate  system,  does  involve  using  a  dis- 
tinguished point:  the  origin  or  zero  vector  0.  The  dependence  on  0  is 
eliminated  in  our  definitions  of  the  notions  of  subspace  and  isometry. 
A subspace of a Euclidean space $\mathfrak{V}$ is a non-empty subset $X$ of
$V$ such that whenever $x$ and $y$ are points in $X$, then
$\alpha \cdot x + (1 - \alpha) \cdot y$ is in $X$ for all real numbers
$\alpha$. In other words, if a subspace contains two points of a line, then it
must contain all points of the line. An isometry from a subspace $X$ of a
Euclidean space $\mathfrak{V}$ onto a subspace $Y$ of a Euclidean space
$\mathfrak{W}$ is a one-one function $f$ from $X$ onto $Y$ preserving the
distance between points; that is, if $x$ and $y$ are in $X$, then
$\|x - y\| = \|f(x) - f(y)\|$, where the distance on the left hand side of the
formula refers to the space $\mathfrak{V}$, and, on the right hand side, to
the space $\mathfrak{W}$.

In  this  terminology,  subspaces  of  a  Euclidean  space  are  not  again 
Euclidean  spaces  since  they  are  not  necessarily  vector  subspaces.  If  a 
subspace  contains  the  zero  vector,  then  it  is  a  vector  subspace.  It  is 
thus  obvious  that  a  translation  of  the  space  will  always  carry  any  sub- 
space  onto  a  vector  subspace,  and  hence  we  can  say  that  every  subspace 
of  a  Euclidean  space  is  isometric  with  a  Euclidean  space.  Clearly,  iso- 
metries  between  Euclidean  spaces  always  preserve  dimension,  and  so 
every  subspace  of  a  Euclidean  space  has  an  unambiguous  dimension. 

Having  thus  defined  the  dimension  of  a  subspace,  a  simple  property 
of  subspaces  that  is  needed  in  the  later  work  can  be  stated : 

LEMMA  1.1.  Every  set  of  m  +  1  points  of  a  Euclidean  space  is  con- 
tained in  a  subspace  of  dimension  at  most  m. 
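
Lemma 1.1 can be checked on small examples. The sketch below assumes points
given as coordinate tuples with integer or rational entries, and computes the
dimension of the smallest subspace (in the sense defined above) containing
them as the rank of the difference vectors from the first point. The function
names rank and affine_dimension are illustrative and do not come from the
paper.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def affine_dimension(points):
    """Dimension of the smallest subspace containing the given points."""
    base = points[0]
    diffs = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    return rank(diffs)

# Four points (m + 1 with m = 3) in a five-dimensional space.
pts = [(0, 0, 0, 0, 0), (1, 0, 0, 0, 0), (0, 1, 0, 0, 0), (1, 1, 0, 0, 0)]
print(affine_dimension(pts))  # prints 2, which is at most m = 3
```

Because there are only m difference vectors for m + 1 points, the rank can
never exceed m, which is the content of the lemma in this coordinate setting.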

Somewhat  more  complicated  but  very  easy  to  prove  is  the  following : 

LEMMA 1.2. Let $X$ and $Y$ be two subspaces of a Euclidean space
$\mathfrak{V}$ having the same finite dimension. Then there is an isometry of
$\mathfrak{V}$ onto itself, mapping $X$ onto $Y$, and leaving the intersection
$X \cap Y$ pointwise fixed.



The  question  we  turn  to  next  is  that  of  defining  the  concept  of  a  geo- 
metrically meaningful  relation  between  points.  There  are  two  different 
aspects  to  the  question.  First,  we  can  consider  a  fixed  Euclidean  space  and 
relations  between  the  points  of  that  one  space.  Second,  the  class  of  all 
Euclidean  spaces  can  be  considered,  and  relations  in  different  spaces  can 
be compared. For the first problem the answer is very simple: In a
Euclidean space $\mathfrak{V}$, the geometrical relations over $\mathfrak{V}$
are just those relations between points of $V$ invariant under the group of
all isometries of $\mathfrak{V}$ onto itself. In other words, an n-ary
relation $R$ over $V$, or a subset $R$ of the cartesian power $V^n$, is a
geometrical relation if for all isometries $f$ of $\mathfrak{V}$ onto itself
and all n-tuples $\langle x_0, \ldots, x_{n-1} \rangle \in V^n$ we have
$\langle x_0, \ldots, x_{n-1} \rangle \in R$ if and only if
$\langle f(x_0), \ldots, f(x_{n-1}) \rangle \in R$. The above definition, of
course, agrees with the well-known program of Klein which asserts that the
group of motions of the space should determine the geometry.

When  we  pass  over  to  the  class  of  all  Euclidean  spaces  some  care  must 
be  taken.  There  are  indeed  some  set-theoretical  problems  connected  with 
the  idea  of  the  class  of  all  spaces.  These  problems  do  not  cause  any 
essential  difficulties  and  can  all  be  solved  by  adopting  a  standard  type  of 
formal  set-theoretical  framework.  Rather  more  important  here  is  the 
question  of  comparison  of  different  spaces.  A  geometrical  relation  over 
the  class  of  all  Euclidean  spaces  should  be  an  assignment  of  one  geometri- 
cal relation  to  each  particular  space.  Clearly  isometric  spaces  should  get 
isometric  relations;  but  more  than  this,  the  assignment  should  be  in- 
sensitive to  dimension.  It  would  be  hopeless  to  try  to  classify  all  ways  of 
assigning  one  kind  of  relation  to  one-dimensional  spaces,  another  kind  to 
two-dimensional spaces, a third to three dimensions and so on. So we are
led to the restriction that a geometrical relation over all spaces is to be
invariant under isometries not only from one space onto another, but also
from one space into another. In more formal terms: an n-ary geometrical
relation over the class of all Euclidean spaces is a function $R$ that assigns
to each space $\mathfrak{V}$ a subset $R_{\mathfrak{V}}$ of $V^n$ such that if
$f$ is an isometry of $\mathfrak{V}$ into a space $\mathfrak{W}$, then for all
n-tuples $\langle x_0, \ldots, x_{n-1} \rangle \in V^n$ we have
$\langle x_0, \ldots, x_{n-1} \rangle \in R_{\mathfrak{V}}$ if and only if
$\langle f(x_0), \ldots, f(x_{n-1}) \rangle \in R_{\mathfrak{W}}$. The effect
of the above definition is to assure that if $X$ is a subspace of a Euclidean
space $\mathfrak{V}$, then the relation $R_{\mathfrak{V}} \cap X^n$, the
restriction to $X$, is isometric to the relation obtained by considering $X$
a Euclidean space in itself.

There are many examples of geometrical relations. The first interesting
case is that of binary relations. It can be shown that there are
$2^{2^{\aleph_0}}$ geometrical binary relations; in fact, they can all be
obtained in the following way: Let $D$ be any set of non-negative real
numbers. For each space $\mathfrak{V}$, let the relation $R_{\mathfrak{V}}$ be
defined by the condition $\langle x, y \rangle \in R_{\mathfrak{V}}$ if and
only if $\|x - y\| \in D$, for all $x, y \in V$. Then $R$ is a geometrical
relation. For ternary relations we need mention only a few: betweenness,
being the midpoint, collinearity, forming an equilateral triangle, being
equidistant from two points, and so on. A similar description of all such
relations can easily be given in terms of sets of triples of real numbers.
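
The construction of a binary geometrical relation from a set D of distances
can be sketched in Python as follows. The sketch assumes a finite set D,
points given as coordinate tuples, and a small floating-point tolerance; the
names make_relation, embed, and quarter_turn are hypothetical. It checks, on
one example, that the relation is unchanged both by an isometry of the plane
onto itself and by an isometry of the plane into three-dimensional space.

```python
import math

def make_relation(D, tol=1e-9):
    """Binary relation R with <x, y> in R iff the distance ||x - y|| lies in D."""
    def R(x, y):
        d = math.dist(x, y)
        return any(math.isclose(d, t, abs_tol=tol) for t in D)
    return R

def embed(x, extra=1):
    """Isometry of one space into a higher-dimensional one (pad with zeros)."""
    return tuple(x) + (0.0,) * extra

def quarter_turn(x):
    """Isometry of the plane onto itself (rotation by 90 degrees)."""
    a, b = x
    return (-b, a)

R = make_relation({5.0, 13.0})
x, y, z = (0.0, 0.0), (3.0, 4.0), (1.0, 0.0)

print(R(x, y), R(quarter_turn(x), quarter_turn(y)), R(embed(x), embed(y)))
print(R(x, z), R(quarter_turn(x), quarter_turn(z)), R(embed(x), embed(z)))
```

The same function R serves for every dimension, which mirrors the requirement
that a geometrical relation over the class of all spaces be insensitive to
dimension.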

Finally  it  is  to  be  noted  that  the  definition  can  be  extended  to  cover 
geometrical  relations  between  lines,  planes,  spheres,  and  the  like,  but  this 
is  hardly  ever  necessary  and  will  not  be  considered  here.  In  view  of  the 
fact  that  the  various  algebraic  loci  are  completely  determined  by  a  finite 
number  of  points  lying  on  them,  any  relation  between  such  objects  can  be 
encoded  into  an  equally  powerful  relation  between  points.  For  example,  a 
binary  relation  between  lines  can  always  be  replaced  by  a  quaternary 
relation  between  points.  Also  there  is  no  need  to  consider  more  than  one 
geometrical relation between points, since two relations, one n-ary and
one m-ary say, can always be replaced by a single (n + m)-ary relation in
an obvious way.

2.  Arithmetical  Extensions  of  Finite  Degree.  All  terminology  of  the 
paper  of  Tarski  and  Vaught  [2]  will  be  adopted  for  the  purposes  of  this 
section, except that relational systems $\langle A, R \rangle$ are considered
where $R$ is an n-ary relation, or relation of rank n, rather than just a
ternary relation.
The  integer  n,  however,  is  to  be  fixed  for  the  discussion.  The  formal 
theory  T  in  the  first-order  predicate  logic  must  then  contain  an  n-placed 
predicate  symbol  P,  as  well  as  the  standard  logical  symbols. 

In  particular,  we  are  interested  in  the  specific  algebraic  condition  given 
by  Tarski  and  Vaught  [2]  in  Theorem  3.1  for  one  relational  system  to  be 
an  arithmetical  extension  of  another.  The  general  notion  of  arithmetical 
extension  defined  in  that  paper  concerns  all  possible  formulas  of  the  first- 
order  logic,  and  for  the  purposes  of  this  paper  a  weaker  notion  involving 
only  a  restricted  class  of  formulas  is  needed.  The  formal  definition 
follows. 

DEFINITION 2.1. The system $\mathfrak{S} = \langle B, S \rangle$ is called an
m-degree arithmetical extension of the system
$\mathfrak{R} = \langle A, R \rangle$ if the following two conditions are
satisfied:

(i) $\mathfrak{S}$ is an extension of $\mathfrak{R}$;

(ii) for every formula $\phi$ containing at most m distinct variables and
every sequence $x \in A^{(\omega)}$, $x$ satisfies $\phi$ in $\mathfrak{R}$ if
and only if $x$ satisfies $\phi$ in $\mathfrak{S}$.

It should be noted that there is no loss of generality in condition (ii) if
the formula $\phi$ is required to contain only variables from the specific
list $v_0, v_1, \ldots, v_{m-1}$, and then the sequence $x$ can be chosen
simply from the set $A^m$.

A generalization of Theorem 3.1 of [2] can now be stated and proved.

THEOREM 2.2. The following two conditions are (jointly) sufficient for a
system $\mathfrak{S} = \langle B, S \rangle$ to be an m-degree arithmetical
extension of a system $\mathfrak{R} = \langle A, R \rangle$:

(i) $\mathfrak{S}$ is an extension of $\mathfrak{R}$;

(ii) for any subset $A'$ of $A$ with less than m elements and any element $b$
of $B$, there exists an automorphism $f$ of $\mathfrak{S}$ such that $f$
leaves $A'$ pointwise fixed and $f(b) \in A$.

PROOF. Using the remark following Definition 2.1 we restrict attention
to formulas involving at most the variables $v_0, v_1, \ldots, v_{m-1}$ and
proceed by induction on the length of formulas. Assuming conditions (i) and
(ii) above, the following is the statement to be proved for formulas $\phi$
of the restricted type:

(*) for all $x \in A^m$, $x$ satisfies $\phi$ in $\mathfrak{R}$ if and only
if $x$ satisfies $\phi$ in $\mathfrak{S}$.

The statement (*) is obviously true for atomic formulas, and it is very
easy to show that if (*) holds for formulas $\phi$ and $\psi$ then it holds
for $\neg\phi$ and $\phi \wedge \psi$. Suppose now that (*) holds for $\phi$,
and consider the formula $\bigvee v_k\,\phi$, where $k < m$. Assume first
that $x \in A^m$ and $x$ satisfies $\bigvee v_k\,\phi$ in $\mathfrak{R}$.
Then for some element $a \in A$, $x(k/a)$ satisfies $\phi$ in $\mathfrak{R}$.
By the hypothesis $x(k/a)$ satisfies $\phi$ in $\mathfrak{S}$, and hence $x$
satisfies $\bigvee v_k\,\phi$ in $\mathfrak{S}$, as was to be shown. Assume
now that $x \in A^m$ and $x$ satisfies $\bigvee v_k\,\phi$ in $\mathfrak{S}$.
Let $b \in B$ be such that $x(k/b)$ satisfies $\phi$ in $\mathfrak{S}$. The
set $A' = \{x_i \mid i < m,\ i \neq k\}$ is a subset of $A$ with fewer than m
elements. In view of condition (ii), let $f$ be an automorphism of
$\mathfrak{S}$ leaving $A'$ pointwise fixed and with $f(b) \in A$. Since $f$
is an automorphism of $\mathfrak{S}$, the sequence
$\langle f(x_0), \ldots, f(x_{k-1}), f(b), \ldots, f(x_{m-1}) \rangle = x(k/f(b))$
must satisfy $\phi$ in $\mathfrak{S}$. Hence, from (*) for $\phi$, we have
that $x(k/f(b))$ satisfies $\phi$ in $\mathfrak{R}$, and finally $x$ satisfies
$\bigvee v_k\,\phi$ in $\mathfrak{R}$, which completes the proof that (*)
holds for $\bigvee v_k\,\phi$. Thus, by induction (*) is established for all
formulas and the theorem is proved.



In the original version of this paper, the author proved a somewhat
different form of Theorem 2.2 which does not require the existence of
automorphisms of the system $\mathfrak{S}$ but rather uses a whole class of
isomorphic subsystems of $\mathfrak{S}$ whose union covers the set $B$.
However, Euclidean spaces possess so many isometries, as was noted in Lemma
1.2 of Section 1, that the simpler theorem just given is quite adequate for
the results of the next section. The other version of the general algebraic
condition for m-degree extensions and its applications will be published
elsewhere. Notice that Theorem 2.2 implies Theorem 3.1 of Tarski-Vaught [2],
since being an arithmetical extension is equivalent to being an m-degree
extension for each m, and the condition (ii) of their Theorem 3.1 obviously
implies condition (ii) of Theorem 2.2 above.

3. Relational Systems Derived from Euclidean Spaces. Let $\mathfrak{V}$ be a
Euclidean space and let $R$ be an n-ary geometrical relation over
$\mathfrak{V}$. The system $\langle V, R \rangle$ is a relational system and
any subspace $X$ of $\mathfrak{V}$ yields a corresponding subsystem
$\langle X, R \cap X^n \rangle$. Our first theorem shows the relation of the
theory of first-order sentences true of $R$ in the whole space $\mathfrak{V}$
to those true in the subspace $X$.

THEOREM 3.1. If $R$ is an n-ary geometrical relation over the Euclidean
space $\mathfrak{V}$ and $X$ is a subspace of $\mathfrak{V}$ of dimension at
least m, then the relational system $\langle V, R \rangle$ is an
(m + 1)-degree arithmetical extension of $\langle X, R \cap X^n \rangle$.

PROOF. We need only verify condition (ii) of Theorem 2.2. Let $X'$ be
a subset of $X$ with at most m elements. Since we may obviously assume
$m > 0$, $X'$ can be contained in a subspace $Y$ of dimension $m - 1$ which is
also contained in $X$. Let $Y_0$ be a subspace of dimension exactly m
containing $Y$ and contained in $X$. Let $b$ be any point in $V$ not in $X$.
Obviously we can find a subspace $Y_1$ of dimension exactly m containing $Y$
and containing $b$. Since $Y_0$ is included in $X$ and $b$ is not,
$Y_0 \cap Y_1 = Y$. Using Lemma 1.2, let $f$ be an isometry of $\mathfrak{V}$
onto itself taking $Y_1$ onto $Y_0$ and leaving $Y$ pointwise fixed. The
function $f$ will thus be an automorphism of $\langle V, R \rangle$ such that
$f(b) \in X$, which completes the proof.

COROLLARY 3.2. If $R$ is an n-ary geometrical relation over the Euclidean
space $\mathfrak{V}$ and $X$ is an infinite dimensional subspace of
$\mathfrak{V}$, then the relational system $\langle V, R \rangle$ is an
arithmetical extension of $\langle X, R \cap X^n \rangle$.



COROLLARY 3.3. If $R$ is an n-ary geometrical relation over the Euclidean
space $\mathfrak{V}$ and $X$ and $Y$ are two infinite dimensional subspaces of
$\mathfrak{V}$, then the relational systems $\langle X, R \cap X^n \rangle$
and $\langle Y, R \cap Y^n \rangle$ are arithmetically equivalent.

In less formal terms the above results can be explained in the following
way. Let $\mathfrak{V}$ be a Euclidean space and $R$ be a geometrical
relation. Consider a sentence $\phi$ in the formal first-order theory of a
predicate that is to be interpreted as the relation $R$. We ask whether
$\phi$ expresses a true property of $R$. Now $\phi$ contains only finitely
many symbols and in particular only a finite number of variables, m + 1 say.
Theorem 3.1 shows us that the truth of $\phi$ can be established by looking
not at the whole space, but only at m-dimensional subspaces of
$\mathfrak{V}$. If $\mathfrak{V}$ is already of a dimension smaller than m,
this result is not of much help. However, if $\mathfrak{V}$ has a very large
dimension or is infinite dimensional, then the reduction is considerable. In
particular, Corollary 3.3 shows that no single first-order property or even a
set of first-order properties of the geometrical relation $R$ can ever
distinguish between two infinite dimensional subspaces of $\mathfrak{V}$.

We  turn  now  from  one  space  to  the  class  of  all  spaces.  Here  we  need  to 
consider  geometrical  relations  over  the  whole  class  of  spaces  as  defined 
in Section 1. As a direct consequence of Theorem 3.1 we obtain:

THEOREM 3.4. If $R$ is a geometrical relation over the class of all Euclidean
spaces and $\phi$ is a sentence of first-order logic with (m + 1) distinct
variables, then $\phi$ is true in all relational systems
$\langle V, R_{\mathfrak{V}} \rangle$, where $\mathfrak{V}$ is a Euclidean
space of dimension at least m, if and only if $\phi$ is true in at least one
such relational system.
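
The dimension bound in Theorem 3.4 depends only on how many distinct
variables a sentence contains. The following Python sketch shows one way that
count might be computed; the encoding of formulas as nested tuples with the
tags "atom", "not", "and", and "exists", as well as the function names
variables and dimension_bound, are assumptions introduced purely for
illustration and are not notation from the paper.

```python
def variables(formula):
    """Set of all distinct variables (free or bound) occurring in a formula."""
    tag = formula[0]
    if tag == "atom":            # ("atom", predicate_name, (var, var, ...))
        return set(formula[2])
    if tag == "not":             # ("not", subformula)
        return variables(formula[1])
    if tag == "and":             # ("and", left, right)
        return variables(formula[1]) | variables(formula[2])
    if tag == "exists":          # ("exists", var, subformula)
        return {formula[1]} | variables(formula[2])
    raise ValueError(f"unknown formula tag: {tag!r}")

def dimension_bound(sentence):
    """The m of Theorem 3.4: number of distinct variables minus one."""
    return max(len(variables(sentence)) - 1, 0)

# "There are v0, v1, v2 such that P(v0, v1, v2) fails", for a ternary P.
sentence = ("exists", "v0",
            ("exists", "v1",
             ("exists", "v2",
              ("not", ("atom", "P", ("v0", "v1", "v2"))))))
print(dimension_bound(sentence))  # prints 2
```

For this sentence the theorem says that truth in one space of dimension at
least 2 already settles truth in all spaces of dimension at least 2, the
infinite dimensional ones included.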

COROLLARY 3.5. If $R$ is a geometrical relation over the class of all
Euclidean spaces and $\mathfrak{V}$ and $\mathfrak{W}$ are infinite
dimensional Euclidean spaces, then the relational systems
$\langle V, R_{\mathfrak{V}} \rangle$ and $\langle W, R_{\mathfrak{W}} \rangle$
are arithmetically equivalent.

The  argument  that  leads  to  3.5  can  be  extended  to  show  that  there  is 
no  collection  of  geometrical  relations  and  no  collection  of  their  first-order 
properties  that  can  distinguish  between  any  two  infinite  dimensional 
Euclidean  spaces.  It  should  be  clear  from  the  proof  given  above  that  if  we 
only  wanted  this  result  about  infinite  dimensional  spaces,  it  would  be 
possible  to  use  Theorem  3.1  of  Tarski-Vaught  [2]  directly  without  going 
through  the  generalization  of  that  method  given  in  Section  2.  However, 



the  relation  between  truth  in  the  whole  space  and  truth  in  its  finite 
dimensional  subspaces  as  developed  here  in  Theorem  3.4  leads  to  an  even 
stronger  result  about  infinite  dimensional  geometry  as  is  explained  in  the 
next  section.  Furthermore,  it  allows  us  to  establish  a  criterion  for  de- 
termining whether  a  relation  defined  in  first-order  logic  in  terms  of  a 
given  geometrical  relation  is  again  such  a  relation.  This  criterion  is  pre- 
sented in  Theorem  3.6  below. 

First it must be made clear when one relation is definable (in first-order
logic) in terms of another relation. Let $R$ be a geometrical relation
and let $\varphi$ be a formula in the first-order theory of $R$ whose free
variables are all contained in the list $v_0, v_1, \ldots, v_{p-1}$. We can
easily think of $\varphi$ as defining a new p-ary relation, $S$ say, in terms
of $R$. Of course this definition must be made relative to each Euclidean
space separately, and so $S$ must be thought of as a function from spaces
$\mathfrak{V}$ to subsets $S_{\mathfrak{V}}$ of $V^p$. Finally in precise
terms we say that $S$ is defined by $\varphi$ in terms of $R$ if for each
Euclidean space $\mathfrak{V}$ and for all sequences $x \in V^{(\omega)}$, we
have $\langle x_0, \ldots, x_{p-1} \rangle \in S_{\mathfrak{V}}$ if and only
if $x$ satisfies $\varphi$ in $\langle V, R_{\mathfrak{V}} \rangle$. Not every
formula $\varphi$ leads to a geometrical relation $S$, however. To see this,
let $\varphi$ contain freely none of the variables $v_0, \ldots, v_{p-1}$, and
choose the formula in such a way that it expresses a property of spaces true
in only one dimension; then $S$ will not be geometrical. The test that $S$
must pass to be a geometrical relation is given next.

THEOREM 3.6. Let the p-ary relation $S$ be defined by the formula $\varphi$
in terms of the geometrical relation $R$. Suppose further that the total
number of variables in $\varphi$, including the free variables, is m + 1.
Then $S$ is a geometrical relation if and only if for all Euclidean spaces
$\mathfrak{V}$ and $\mathfrak{W}$ of dimension at most m and all isometries
$f$ of $\mathfrak{V}$ into $\mathfrak{W}$, we have for all sequences
$x \in V^{(\omega)}$,
$\langle x_0, \ldots, x_{p-1} \rangle \in S_{\mathfrak{V}}$ if and only if
$\langle f(x_0), \ldots, f(x_{p-1}) \rangle \in S_{\mathfrak{W}}$.


PROOF. Obviously, if $S$ is a geometrical relation, then it satisfies the
condition given for isometries. Suppose then that $S$ is not a geometrical
relation. Thus, there must be Euclidean spaces $\mathfrak{V}$ and
$\mathfrak{W}$ and an isometry $f$ from $\mathfrak{V}$ into $\mathfrak{W}$ and
a sequence $x \in V^{(\omega)}$ such that the formulas
$\langle x_0, \ldots, x_{p-1} \rangle \in S_{\mathfrak{V}}$ and
$\langle f(x_0), \ldots, f(x_{p-1}) \rangle \in S_{\mathfrak{W}}$ are not
equivalent. By the symmetry of the situation we need treat only the case
where $\langle x_0, \ldots, x_{p-1} \rangle \in S_{\mathfrak{V}}$ and
$\langle f(x_0), \ldots, f(x_{p-1}) \rangle \notin S_{\mathfrak{W}}$. Since
$\varphi$ defines $S$, we conclude that $x$ satisfies $\varphi$ in
$\langle V, R_{\mathfrak{V}} \rangle$ and the sequence
$\langle f(x_0), f(x_1), \ldots \rangle$ does not satisfy $\varphi$ in
$\langle W, R_{\mathfrak{W}} \rangle$. Due to the fact that $\varphi$ has only
free variables in the set $\{v_0, \ldots, v_{p-1}\}$ we can assume without
loss of generality that $\{x_i \mid i < \omega\} = \{x_0, \ldots, x_{p-1}\}$.
Now by hypothesis $p \leq m + 1$, and so there exists a subspace $X$ of
$\mathfrak{V}$ of dimension at most m containing the set
$\{x_0, \ldots, x_{p-1}\}$; in particular, if $\mathfrak{V}$ is of dimension
at most m, we shall assume $X = V$, and otherwise that $X$ is of dimension
exactly m. In any case we can conclude with the aid of Theorem 3.1 that the
sequence $x$ satisfies $\varphi$ in
$\langle X, R_{\mathfrak{V}} \cap X^n \rangle$. The image of $X$ under $f$ is
a subspace $X'$ of $\mathfrak{W}$ of dimension equal to that of $X$. Let $Y$
be a subspace of $\mathfrak{W}$ that is either equal to $W$ in case
$\mathfrak{W}$ is of dimension less than m or is of dimension exactly m, and
which in any case contains $X'$. By the same argument as above the sequence
$\langle f(x_0), f(x_1), \ldots \rangle$ does not satisfy $\varphi$ in
$\langle Y, R_{\mathfrak{W}} \cap Y^n \rangle$. Now the two subspaces $X$ of
$\mathfrak{V}$ and $Y$ of $\mathfrak{W}$ are themselves isometric with
Euclidean spaces $\mathfrak{V}'$ and $\mathfrak{W}'$ of dimensions at most m
by isometries $g$ and $h$, where $g$ is from $V'$ onto $X$ and $h$ is from $Y$
onto $W'$. Let $f' = hfg$, which will be an isometry from $\mathfrak{V}'$ into
$\mathfrak{W}'$. By our very construction we can obviously conclude that the
sequence $\langle g^{-1}(x_0), g^{-1}(x_1), \ldots \rangle$ satisfies
$\varphi$ in $\langle V', R_{\mathfrak{V}'} \rangle$ and hence
$\langle g^{-1}(x_0), \ldots, g^{-1}(x_{p-1}) \rangle \in S_{\mathfrak{V}'}$
while $\langle hf(x_0), \ldots, hf(x_{p-1}) \rangle \notin S_{\mathfrak{W}'}$.
This finally shows that $S$ does not satisfy the condition of the theorem,
which completes the proof.

4.  Axiomatic  Geometry.  In  his  paper  [3],  Tarski  presents  a  particularly 
neat  axiomatic  system  for  two-dimensional  Euclidean  geometry  in  terms 
of  the  basic  notions  of  betweenness  and  equidistance.  As  is  indicated  in  [3]  an 
axiomatization  for  any  finite  dimension  can  be  obtained  by  a  simple 
change  in  two  of  the  axioms.  All  these  axiomatic  theories  are  decidable, 
and  it  follows  from  the  method  in  Tarski's  monograph  [1]  that  there  is 
even a uniform method for deciding for each integer m whether an elementary
sentence in terms of betweenness and equidistance is true in
Euclidean  spaces  of  dimension  m.  It  is  to  be  shown  here  that  there  is  also 
an  effective  decision  method  for  the  class  of  sentences  true  in  infinite 
dimensional  spaces. 

Let $B$ and $E$ be respectively ternary and quaternary geometrical
relations denoting the betweenness and equidistance relations in Euclidean
spaces. The first-order theory, then, must contain a ternary and a
quaternary predicate symbol. Let $\mathcal{E}_m$, $m < \omega$, be the class of all
sentences of this first-order theory true in the relational systems
$\langle V, B_{\mathfrak{V}}, E_{\mathfrak{V}} \rangle$ where $\mathfrak{V}$ is an
$m$-dimensional Euclidean space. Since all $m$-dimensional spaces are
isometric, the theory $\mathcal{E}_m$ is complete. In this section we shall often
use the word theory to mean any class of sentences of the first-order logic
that is consistent and is closed under all the usual rules of deduction; while
a complete theory is a maximal such class. Let
$\mathcal{E} = \bigcap_{m < \omega} \mathcal{E}_m$ be the common part of all these
theories, that is, the class of sentences true in all finite dimensions.
$\mathcal{E}$ is, of course, not a complete theory, but it is a decidable theory
as will be shown below. One further theory will be considered, namely
$\mathcal{E}_{\infty} = \bigcup_{m < \omega} \bigcap_{n > m} \mathcal{E}_n$, that is, the
class of sentences true in all but a finite number of dimensions.
$\mathcal{E}_{\infty}$ is a theory since it is the union of an increasing sequence
of theories, but what is surprising is that $\mathcal{E}_{\infty}$ is a complete
theory and, in fact, is the class of sentences true in all infinite
dimensions. We turn now to the systematic account of these results.

LEMMA 4.1. $\mathcal{E}_m \sim \bigcup_{n \neq m} \mathcal{E}_n \neq 0$.

PROOF. In words: there is a sentence true in the dimension $m$ but not
true in any other dimension. To demonstrate this one has only to translate
into formal logical symbols the sentence that expresses the fact that
there exists a configuration of $m + 1$ distinct and mutually equidistant
points, but no such configuration with $m + 2$ points. Notice that the
trivial dimension $m = 0$ is accommodated quite nicely.
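
The dimension count behind this sentence can be checked mechanically. The following is only an illustrative sketch, not part of the original argument; the helper names `gram_of_equidistant`, `rank`, and `affine_dimension_needed` are hypothetical. It assumes the standard fact that if k points are mutually equidistant with squared distance 2, the difference vectors from the first point have a Gram matrix with 2 on the diagonal and 1 elsewhere, and that the rank of this matrix is the dimension of the affine hull of the points.

```python
from fractions import Fraction


def gram_of_equidistant(k):
    """Gram matrix of the k-1 difference vectors p_i - p_0 for k mutually
    equidistant points with squared distance 2 (2 on the diagonal, 1 off it)."""
    n = k - 1
    return [[Fraction(2 if i == j else 1) for j in range(n)] for i in range(n)]


def rank(matrix):
    """Rank of a rational matrix by exact Gaussian elimination."""
    rows = [row[:] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def affine_dimension_needed(k):
    """Dimension of the affine hull of k mutually equidistant points."""
    if k <= 1:
        return 0
    return rank(gram_of_equidistant(k))


if __name__ == "__main__":
    for m in range(5):
        print(m, affine_dimension_needed(m + 1), affine_dimension_needed(m + 2))
```

Under these assumptions, $m + 1$ mutually equidistant points span exactly dimension $m$, while $m + 2$ such points would need dimension $m + 1$, which is the content of the sentence described in the proof.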

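LEMMA 4.2. If $A_m$ is any sentence in the set
$\mathcal{E}_m \sim \bigcup_{n \neq m} \mathcal{E}_n$, then
$\mathcal{E}_m = \mathrm{cl}(\mathcal{E} \cup \{A_m\})$.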

REMARK. The symbol $\mathrm{cl}(\mathcal{X})$ denotes the closure of the set of
sentences $\mathcal{X}$ under the rules of deduction of the first-order predicate
logic. Thus 4.2 expresses the fact that the theory $\mathcal{E}_m$ results from the
theory $\mathcal{E}$ by the addition of any single axiom chosen as indicated in the
hypothesis of the lemma.

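PROOF. Assuming the hypothesis, let $\varphi$ be any sentence in $\mathcal{E}_m$ and
consider the implication
$[A_m \rightarrow \varphi] = \neg[A_m \wedge \neg\varphi]$. Clearly
$[A_m \rightarrow \varphi] \in \mathcal{E}_m$ since $\varphi \in \mathcal{E}_m$. But also,
for any $n \neq m$, $A_m \notin \mathcal{E}_n$ and so $\neg A_m \in \mathcal{E}_n$;
hence, $[A_m \rightarrow \varphi] \in \mathcal{E}_n$. It follows at once that
$[A_m \rightarrow \varphi] \in \mathcal{E}$. This argument shows that
$\mathcal{E}_m \subseteq \mathrm{cl}(\mathcal{E} \cup \{A_m\})$. The obviousness of the
opposite inclusion completes the proof.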

THEOREM 4.3. The only finite complete extensions of the theory $\mathcal{E}$ are
the theories $\mathcal{E}_m$, $m < \omega$.

PROOF. That each complete theory $\mathcal{E}_m$ is a finite extension of
$\mathcal{E}$ is the content of Lemmas 4.1 and 4.2. Assume then that
$\mathcal{E}^*$ is a finite complete extension of $\mathcal{E}$ with
$\mathcal{E}^* \neq \mathcal{E}_m$ for all $m < \omega$. Let $A^*$ be the single axiom
needed to have $\mathcal{E}^* = \mathrm{cl}(\mathcal{E} \cup \{A^*\})$. Since
$\mathcal{E}^* \neq \mathcal{E}_m$, it follows that $A^* \notin \mathcal{E}_m$. Thus,
$\neg A^* \in \mathcal{E}_m$, for all $m < \omega$, which implies
$\neg A^* \in \mathcal{E}$. Thus, the theory $\mathcal{E}^*$ would have to be
inconsistent, which is impossible.

LEMMA 4.4. If the sentences $A_m$ are chosen as in Lemma 4.2, then
$\mathcal{E}_{\infty} = \mathrm{cl}(\mathcal{E} \cup \{\neg A_m \mid m < \omega\})$.


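PROOF. $\neg A_m \in \bigcap_{n > m} \mathcal{E}_n$ by construction, and hence
$\neg A_m \in \mathcal{E}_{\infty}$ for each $m < \omega$. Thus,
$\mathrm{cl}(\mathcal{E} \cup \{\neg A_m \mid m < \omega\}) \subseteq \mathcal{E}_{\infty}$.
Let $\varphi$ be any sentence in $\mathcal{E}_{\infty}$. There exists an integer $m$
such that $\varphi \in \bigcap_{n \geq m} \mathcal{E}_n$. Consider the implication
$[[\neg A_0 \wedge \neg A_1 \wedge \cdots \wedge \neg A_{m-1}] \rightarrow \varphi]$.
It is easy to see that this sentence is in $\mathcal{E}$ and hence
$\varphi \in \mathrm{cl}(\mathcal{E} \cup \{\neg A_m \mid m < \omega\})$. The converse
inclusion is thus established.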

LEMMA 4.5. $\mathcal{E}_{\infty}$ is the set of sentences true in all infinite
dimensional spaces and is complete.

PROOF. This lemma is a direct consequence of Theorem 3.4, Corollary
3.5, and the definition of $\mathcal{E}_{\infty}$.

THEOREM 4.6. There is only one infinite complete extension of the theory
$\mathcal{E}$ and that is the theory $\mathcal{E}_{\infty}$.

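PROOF. That $\mathcal{E}_{\infty}$ is a complete extension of $\mathcal{E}$ follows
from Lemma 4.5. Let $\mathcal{E}^*$ be any other infinite complete extension of
$\mathcal{E}$. We have $\mathcal{E}^* \neq \mathcal{E}_m$ for all $m < \omega$. In the
notation of Lemma 4.2, $A_m \notin \mathcal{E}^*$ for all $m < \omega$. Hence
$\neg A_m \in \mathcal{E}^*$ for all $m < \omega$. This last implies, in view of
Lemma 4.4, that $\mathcal{E}_{\infty} \subseteq \mathcal{E}^*$. Since both these
theories are complete, we conclude that $\mathcal{E}_{\infty} = \mathcal{E}^*$, as was
to be shown.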

THEOREM 4.7. The theories $\mathcal{E}$ and $\mathcal{E}_{\infty}$ are decidable.

PROOF. Let $\varphi$ be any sentence. Count the number of variables in
$\varphi$, say $m + 1$. Now $\varphi \in \mathcal{E}_{\infty}$ if and only if
$\varphi \in \mathcal{E}_m$ by Theorem 3.4. Since the condition
$\varphi \in \mathcal{E}_m$ can be decided effectively, we have an effective
decision procedure for $\mathcal{E}_{\infty}$. Finally, notice that
$\varphi \in \mathcal{E}$ if and only if
$\varphi \in \bigcap_{n \leq m} \mathcal{E}_n$; again a condition that can be
checked in a finite number of steps. The proof is complete.
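
The shape of this decision procedure can be sketched in code. The sketch is illustrative only: the tuple encoding of formulas and the names `variables`, `decide_e_infinity`, and `decide_e` are assumptions made here, and the per-dimension decision method (Tarski's method from [1]) is not implemented but passed in as a hypothetical oracle `true_in_dimension(phi, n)`.

```python
# Hypothetical formula encoding (an assumption, not the paper's notation):
#   ("B", x, y, z)       betweenness atom
#   ("E", x, y, u, v)    equidistance atom
#   ("=", x, y)          identity atom
#   ("not", f), ("and", f, g), ("or", f, g), ("implies", f, g)
#   ("forall", x, f), ("exists", x, f)

ATOMS = ("B", "E", "=")
QUANTIFIERS = ("forall", "exists")


def variables(phi):
    """Set of all distinct variables (free or bound) occurring in phi."""
    op = phi[0]
    if op in ATOMS:
        return set(phi[1:])
    if op in QUANTIFIERS:
        return {phi[1]} | variables(phi[2])
    return set().union(*(variables(sub) for sub in phi[1:]))


def decide_e_infinity(phi, true_in_dimension):
    """phi has m + 1 variables; truth in infinite dimensions is read off
    from truth in dimension m, as in the proof."""
    m = len(variables(phi)) - 1
    return true_in_dimension(phi, m)


def decide_e(phi, true_in_dimension):
    """phi holds in every finite dimension iff it holds in dimensions 0..m."""
    m = len(variables(phi)) - 1
    return all(true_in_dimension(phi, n) for n in range(m + 1))
```

For instance, the sentence "there exist two distinct points" has two variables, so only dimensions 0 and 1 need to be consulted; with any correct oracle it would be rejected for the common theory (it fails in dimension 0) and accepted for the infinite dimensional theory.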

THEOREM 4.8. For any formula $\varphi$ with all free variables in the set
$\{v_0, \ldots, v_{p-1}\}$, it can be decided effectively whether $\varphi$
defines a $p$-ary geometrical relation in terms of the geometrical relations
$B$ and $E$.


PROOF. The formula $\varphi$ will contain only $m + 1$ variables. According
to Theorem 3.6, we need only check whether $\varphi$ defines a geometrical
relation with respect to Euclidean spaces of at most dimension $m$. In fact,
it is sufficient to restrict attention to one Euclidean space $\mathfrak{V}$ of
dimension exactly $m$ and only consider the identity isometries from Euclidean
subspaces of $\mathfrak{V}$ onto themselves. This checking can be carried out by
seeing if the relation defined by $\varphi$ and restricted to a subspace is the
same relation obtained by restricting all the free variables in $\varphi$ to the
subspace and relativising all the quantifiers in $\varphi$ to the subspace. But
the predicate of being in the least subspace spanned by a given number of
points is definable in first-order logic in terms of betweenness and
equidistance. Thus, since the number of points needed for specifying a
subspace is at most $m + 1$, we can translate the question of the equivalence
of the two forms of the relation defined by $\varphi$ into a single first-order
sentence. This sentence, then, need only be checked for validity in dimension
$m$, a process that is effective.

This  completes  the  formal  development  of  the  subject,  and  the  author 
would  like  to  conclude  with  some  informal  remarks.  An  amusing  point  to 
notice  in  the  arguments  of  this  section  is  that  any  sentence  Am  satisfying 
the  conditions  of  Lemma  4.2  must  necessarily  contain  at  least  m  +  2 
variables.  That  the  lower  bound  can  actually  be  attained  in  the  theory  of 
B  and  E  can  be  verified  by  writing  out  in  logical  symbols  the  sentence 
given in words in the proof of Lemma 4.1.

The consequence of these results for the problem of axiomatizing
Euclidean geometry is that the theory $\mathcal{E}$ is the only one of these
theories that need be axiomatized, for we have shown above that one may pass
from $\mathcal{E}$ to $\mathcal{E}_m$ simply by the adjunction of the sentences
$A_m$. Though all details have not been completely checked by the author, it
would seem that an adequate axiomatization of the theory $\mathcal{E}$ would
result by dropping axioms A11 and A12 of the system given by Tarski in [3].
Finally, the simplest way of axiomatizing infinite dimensional geometry would
be to add to $\mathcal{E}$ an infinite list of sentences expressing the fact that
any number of mutually equidistant points can be found.

This last remark about infinite dimensional geometry indicates an
immediate difference between the first-order formalism and theories
permitting quantification over arbitrary finite sets as explained for the
theory $\mathcal{E}_2'$ in [3]. For it is seen at once that the infinite
dimensional character of a space can be expressed in a single sentence
involving variables ranging over arbitrary finite sets, a fact clearly not
true in the first-order theories in view of Lemma 4.5. Further, there seems to
be no hope of giving a simple syntactical method like the counting of the
number of variables for showing the relation of the truth of a sentence in
one dimension to the truth in another dimension in these extended
theories as was done in the fundamental result for our investigation,
Theorem 3.4. However, Tarski has noticed that Corollary 3.5 about
infinite dimensional Euclidean spaces still holds for properties of relations
formulated in the extended theory with finite sets, because the result in
Theorem 3.1 of [2] remains valid in this generalization. Hence, even from
this broader view, there is no way to distinguish between infinite
dimensional Euclidean spaces.


Bibliography 

[1] TARSKI, A., A decision method for elementary algebra and geometry. Second
    edition, Berkeley and Los Angeles 1951, VI + 63 pp.
[2] TARSKI, A. and VAUGHT, R. L., Arithmetical extensions of relational systems.
    Compositio Mathematica, vol. 13 (1957), pp. 81-102.
[3] TARSKI, A., What is elementary geometry? This volume, pp. 16-29.


Symposium  on  the  Axiomatic  Method 


BINARY  RELATIONS  AS  PRIMITIVE  NOTIONS 
IN  ELEMENTARY  GEOMETRY 

RAPHAEL  M.  ROBINSON 

University  of  California,  Berkeley,   California,   U.S.A. 

1.  Introduction.  We  shall  consider  equidistance  and  the  order  of  points 
on  a  line  as  the  standard  primitive  notions  of  Euclidean,  hyperbolic,  or 
elliptic  geometry.  Here  equidistance  is  a  quaternary  relation,  whereas 
the  order  of  points  on  a  line  is  described  in  Euclidean  or  hyperbolic 
geometry  by  the  ternary  relation  of  betweenness,  and  in  elliptic  geometry 
by  the  quaternary  relation  of  cyclic  order.  Various  axiom  systems  have 
been  given  in  terms  of  these  primitive  notions;  see,  for  example,  Tarski 
[7]  for  the  Euclidean  case.  The  adequacy  of  other  proposed  primitive 
notions  for  geometry  will  be  judged  by  comparison  with  the  standard  ones. 

M. Pieri [4] has shown that a ternary relation, that of a point being
equally distant from two other points, can be used as the only primitive notion
of  Euclidean  geometry  of  two  or  more  dimensions.  Indeed,  in  terms  of  this 
relation,  it  is  possible  to  define  equidistance  of  points  in  general,  and  the 
order  of  points  on  a  line.  The  same  ternary  relation  is  also  a  possible 
primitive  notion  for  either  of  the  non-Euclidean  geometries,  hyperbolic  or 
elliptic.  A  detailed  discussion  of  Pieri's  relation  is  given  in  Section  2. 

We  may  raise  the  question  whether  one  or  more  binary  relations  might 
serve  as  the  primitive  notions  in  some  of  the  geometries.  This  is  im- 
possible in  Euclidean  geometry  as  described  above,  since  the  primitive 
notions  of  equidistance  and  order  are  preserved  by  similarity  transfor- 
mations, and  no  non-trivial  binary  relation  is  so  preserved.  However,  let 
us  choose  a  unit  distance  in  the  Euclidean  space,  and  regard  the  property 
of  two  points  being  a  unit  distance  apart  as  a  new  primitive  notion.  Then 
only  isometric  transformations  preserve  the  primitive  notions.  The  prob- 
lem concerning  the  possibility  of  using  just  binary  relations  as  primitive 
notions  is  thus  reinstated. 

We shall suppose at all times that a $p$-dimensional Euclidean,
hyperbolic, or elliptic space is under discussion.¹ The exact value of $p$ is
usually immaterial, except that we shall always suppose that $p \geq 2$;

1  Many  of  the  results  stated  for  elliptic  geometry  apply  also  to  spherical  geome- 
try, but  we  shall  not  go  into  this. 

this condition will be understood henceforth without explicit mention. (The
one-dimensional case will be excluded because the results there are usually
exceptional and rather trivial.) Only the standard case where the base
field is the field of real numbers will be considered. Thus, for example,
the Euclidean $p$-space will be regarded as the direct $p$th power of the field
of real numbers. The points of the space will be denoted by
$A, B, C, \ldots, X, Y, Z$, or by these letters with subscripts or
superscripts. (In contrast to this, the letters $a, b, c, \ldots, x, y, z$ will
be used for real numbers, with $i, j, k, \ldots, p, q, r$ reserved for natural
numbers.)

The space will be regarded as a metric space, the distance from $A$ to $B$
being denoted by $AB$. Thus the symbol $AB$ always denotes a non-negative
real number (unless it occurs as part of a formula, such as
$AB \perp CD$, which is defined as a whole). In the Euclidean case, the
distance $AB$ is the square root of the sum of the squares of the differences
of the coordinates of $A$ and $B$. In the non-Euclidean cases, the metric will
be assumed to be chosen so that the natural unit of length is being used.

The definability of a notion in terms of given notions will always be
understood in this paper as elementary (= arithmetical) definability.²
That is, aside from the given notions, a definition will use only the
concepts of elementary logic, and the only variables used will be
$A, B, C, \ldots$, which range over points of the given space. The following
logical symbols will be used: $\wedge$ (and), $\vee$ (or), $\neg$ (not),
$\rightarrow$ (if ... then ...), $\leftrightarrow$ (if and only if),
$\bigwedge$ (for every), and $\bigvee$ (there exists). Identity (between points
of the given space) will also be regarded as a logical concept. In addition, we
shall sometimes use equations such as $AB = CD$. Here the entire equation
may be regarded as a convenient notation for the quaternary relation of
equidistance between points.

We  return  now  to  the  question  whether  some  of  the  geometries  might 
be  based  on  primitive  notions  which  are  all  binary  relations.  In  other 
words,  are  there  some  binary  relations  which  are  definable  in  terms  of  the 
usual  primitive  notions,  and  in  terms  of  which  the  usual  primitive  notions 
are  definable?  We  shall  show  that  in  elliptic  geometry,  it  is  possible  to 
use  a  single  binary  relation  as  the  only  primitive  notion.  3  In  particular, 


2 Some problems concerning more general types of definability are studied by
Royden [5].

3 This result was found independently by H. L. Royden and the author, shortly
after  listening  to  a  lecture  by  Alfred  Tarski  on  the  primitive  notions  of  Euclidean 
geometry,  based  in  part  on  Beth  and  Tarski  [1],  in  the  spring  of  1956.  The  binary 
relation  used  by  Royden  was  AB  ^  yr/4. 


as shown in Section 3, the binary relation $AB = \pi/2$, which expresses
that the two points $A$ and $B$ are at a distance $\pi/2$ apart (which is the
maximum possible distance in the elliptic space), is a suitable primitive
notion for elliptic geometry.

On the other hand, in Euclidean geometry (with a unit distance given),
or  in  hyperbolic  geometry,  it  is  impossible  to  use  binary  relations  as  the 
only  primitive  notions.  This  is  proved  in  Section  4  for  binary  relations  of 
the  form  AB  =  d,  and  in  Section  5  for  binary  relations  in  general.  The 
difference  between  these  geometries  and  elliptic  geometry  is  due  mainly 
to  the  fact  that  the  elliptic  space  is  bounded.  In  fact,  as  shown  in  Section  6, 
the  local  properties  of  Euclidean  and  hyperbolic  spaces  are  expressible 
in  terms  of  a  binary  relation,  that  of  two  points  being  at  a  prescribed 
distance  apart. 

If  d  is  chosen  so  that  the  distance  d  is  definable  in  terms  of  the  usual 
primitive  notions,  then  the  system  based  on  the  binary  relation  AB  =  d 
as  its  only  primitive  notion  is  weaker  than  the  standard  one  so  far  as 
definability  of  concepts  is  concerned,  but  otherwise  it  is  incomparable. 
The  distance  d  cannot  always  be  definable,  since  there  are  a  non-de- 
numerable  infinity  of  possible  values  of  d,  but  only  a  denumerable 
infinity  of  possible  definitions.  The  problem  of  determining  which 
distances  d  are  definable  is  solved  in  Section  7. 

2. Pieri's ternary relation. As mentioned in Section 1, Pieri has shown
that  in  Euclidean  geometry,  it  is  possible  to  define  the  equidistance 
relation  AB  =  CD  and  betweenness  in  terms  of  the  ternary  relation 
AB  =  BC.  His  argument  is  also  valid  in  hyperbolic  geometry.  We  give 
below  a  proof  somewhat  different  than  Pieri's,  and  then  show  how  to 
extend  it  to  the  elliptic  case. 

THEOREM 2.1. Pieri's ternary relation $AB = BC$ is a suitable primitive
notion for Euclidean, hyperbolic, or elliptic geometry.

PROOF. Let the symbols bet(A, B, C), col(A, B, C), and sym(A, B, C)
express respectively that B is between A and C, that A, B, C are collinear,
and that A and C are symmetric with respect to B (that is, that B is the
midpoint of the segment joining A and C). Then the following definitions
are valid formulas in Euclidean or hyperbolic geometry:

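$$AB \leq BC \leftrightarrow (\bigwedge X)[BX = XC \rightarrow (\bigvee Y)(AY = YB = BX)],$$

$$\mathrm{bet}(A, B, C) \leftrightarrow B \neq A \wedge B \neq C \wedge (\bigwedge X)[XA \leq AB \wedge XC \leq CB \rightarrow X = B],$$

$$\mathrm{col}(A, B, C) \leftrightarrow A = B \vee A = C \vee B = C \vee \mathrm{bet}(B, A, C) \vee \mathrm{bet}(A, B, C) \vee \mathrm{bet}(A, C, B),$$

$$\mathrm{sym}(A, B, C) \leftrightarrow (\bigwedge X)[\mathrm{col}(A, B, X) \wedge AB = BX \leftrightarrow X = A \vee X = C],$$

$$AB = CD \leftrightarrow (\bigvee X, Y)[\mathrm{sym}(A, X, C) \wedge \mathrm{sym}(B, X, Y) \wedge YC = CD].$$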

Hence  betweenness  and  equidistance  are  definable  in  terms  of  Pieri's 
relation,  as  was  to  be  shown. 

We shall now extend the result to the elliptic case.⁴ Some modifications
of the above definitions are required. The definition of $AB \leq BC$
is still correct. The validity of the next definition depends on how we
interpret bet(A, B, C) in elliptic geometry. It is correct if we understand
this to mean that there is a unique shortest line segment joining A and C,
and that B is an interior point of this segment. Notice that when
$AC = \pi/2$, as well as when $A = C$, there is no such point B.

The  definition  of  col  (A,  B,  C)  given  above  is  not  valid  in  elliptic 
geometry.  We  shall  give  a  definition  below  which  expresses  collinearity 
as  a  special  case  of  cyclic  order,  which  we  also  need.  Once  collinearity  has 
been  defined,  the  previous  definitions  of  sym  (A,  B,  C)  and  AB  =  CD 
may  be  used.  Notice  that  the  definition  of  sym  (A,  B,  C)  is  so  formulated 
that the relation holds, as it should, when $A = C$ and $AB = \pi/2$.

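We now wish to define the cyclic order of points on a line. We start by
defining recursively a relation $\mathrm{seq}(A_0, A_1, \ldots, A_n)$ for
$n \geq 2$, as follows:

$$\mathrm{seq}(A_0, A_1, A_2) \leftrightarrow \mathrm{bet}(A_0, A_1, A_2),$$

$$\mathrm{seq}(A_0, A_1, \ldots, A_n) \leftrightarrow \mathrm{seq}(A_0, A_1, \ldots, A_{n-1}) \wedge \mathrm{bet}(A_{n-2}, A_{n-1}, A_n) \wedge A_n \neq A_0 \wedge \neg\,\mathrm{bet}(A_{n-1}, A_0, A_n).$$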
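
The recursion can be unfolded mechanically into a flat conjunction. The following sketch is only an illustration; the function name `seq_conjuncts` and the textual rendering of the atomic formulas are assumptions made here, not notation from the paper.

```python
def seq_conjuncts(n):
    """Unfold seq(A0, ..., An) into the list of its conjuncts (n >= 2)."""
    if n < 2:
        raise ValueError("seq is defined only for n >= 2")
    if n == 2:
        # Base case: seq(A0, A1, A2) is just bet(A0, A1, A2).
        return ["bet(A0, A1, A2)"]
    # Recursive case: the conjuncts for n - 1 plus three new conditions.
    return seq_conjuncts(n - 1) + [
        f"bet(A{n - 2}, A{n - 1}, A{n})",
        f"A{n} != A0",
        f"not bet(A{n - 1}, A0, A{n})",
    ]


if __name__ == "__main__":
    print(" and ".join(seq_conjuncts(4)))
```

Each step of the recursion adds exactly three conjuncts, so the unfolded formula for a given n has 3(n - 2) + 1 of them.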

It is seen that $\mathrm{seq}(A_0, A_1, \ldots, A_n)$ expresses that the sequence
of points $A_0, A_1, \ldots, A_n$ lie in this cyclic order on a line, and divide
the line into intervals such that, excluding the one from $A_n$ to $A_0$, the
sum of the lengths of any two consecutive intervals is less than $\pi/2$. This
extraneous condition concerning the lengths of the intervals may be removed by

4  A  reader  who  is  concerned  only  with  Euclidean  and  hyperbolic  geometry  may 
proceed  directly  to  Section  4. 


putting

$$\mathrm{ord}(A_0, A_1, \ldots, A_n) \leftrightarrow (\bigvee X_0, X_1, \ldots, X_{4n})[X_0 = A_0 \wedge X_4 = A_1 \wedge \cdots \wedge X_{4n} = A_n \wedge \mathrm{seq}(X_0, X_1, \ldots, X_{4n})].$$

Then, as is easily seen, $\mathrm{ord}(A_0, A_1, \ldots, A_n)$ expresses simply
that the points $A_0, A_1, \ldots, A_n$ are in this cyclic order on a line. In
particular, ord(A, B, C, D) is the basic quaternary relation of cyclic order in
elliptic geometry. Furthermore,

$$\mathrm{col}(A, B, C) \leftrightarrow A = B \vee A = C \vee B = C \vee \mathrm{ord}(A, B, C),$$

so that, as previously noted, equidistance is also definable. Thus Pieri's
ternary relation is a suitable primitive notion for elliptic geometry. (It
may be noticed that the relation $\mathrm{seq}(A_0, A_1, \ldots, A_n)$, defined
above for all $n \geq 2$, was needed only for $n \leq 12$.)

3. Binary primitives for elliptic geometry. In elliptic geometry, Pieri's
relation is definable in terms of any distance $d$ with $0 < d \leq \pi/2$. The
converse holds only if $\cos d$ is algebraic. In this case, the binary relation
$AB = d$ is a suitable primitive notion for elliptic geometry. A detailed
proof is given only for $d = \pi/2$, which seems to be the most interesting
case, since the condition $AB = \pi/2$ expresses that the polar of either of
the points A or B passes through the other.

We start by noticing two definitions that can be used in the elliptic
plane. The formulas

$$\mathrm{col}(A, B, C) \leftrightarrow (\bigvee X)[AX = BX = CX = \pi/2]$$

and

$$AB \perp CD \leftrightarrow A \neq B \wedge C \neq D \wedge (\bigvee X)[AX = BX = \pi/2 \wedge \mathrm{col}(C, D, X)]$$

define the collinearity of three points and the perpendicularity of two
lines, in terms of the distance $\pi/2$. Here, of course, a notation such as
$AX = BX = \pi/2$ is short for $AX = \pi/2 \wedge BX = \pi/2$; the concept of
equidistance is not involved.

THEOREM 3.1. The following formula holds in the elliptic plane:

$$BC = CA = AB = \pi/2 \wedge \mathrm{col}(B, P, C) \wedge \mathrm{col}(C, Q, A) \wedge \mathrm{col}(A, R, B) \wedge P \neq C \wedge Q \neq C \wedge AP \perp QR \wedge BQ \perp PR \rightarrow AR = \pi/4.$$


PROOF. One model of the elliptic plane consists of all lines through
the origin in a three-dimensional Euclidean space. We may identify A,
B, C with the x, y, z axes. Then AP and BQ correspond to planes $z = ay$
and $z = bx$, with suitable values of $a$ and $b$. The planes through P
perpendicular to BQ and through Q perpendicular to AP are

$$x - aby + bz = 0, \qquad -abx + y + az = 0.$$

These planes intersect the plane $z = 0$ in lines where $x = aby$ and
$y = abx$, respectively. These lines coincide only if $ab = \pm 1$. Thus the
point R is represented by one of the lines $y = \pm x$, $z = 0$.
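
The computation in this proof can be replayed numerically. The sketch below is illustrative only and is not part of the original proof; the function names are hypothetical. It assumes the model described above: elliptic points are lines through the origin, represented by direction vectors, with the distance between two points taken as the acute angle between the lines.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def elliptic_distance(u, v):
    """Acute angle between the lines spanned by u and v (in [0, pi/2])."""
    c = abs(dot(u, v)) / math.sqrt(dot(u, u) * dot(v, v))
    return math.acos(min(1.0, c))


A, B, E_Z = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)


def candidates_for_R(a, b):
    """Directions in the plane z = 0 cut out by the two perpendicularity
    conditions, for P on plane z = a*y and Q on plane z = b*x."""
    P = (0.0, 1.0, a)               # P lies on line BC, and plane AP is z = a*y
    Q = (1.0, 0.0, b)               # Q lies on line CA, and plane BQ is z = b*x
    normal_AP = cross(A, P)
    normal_BQ = cross(B, Q)
    normal_PR = cross(P, normal_BQ)  # plane through P perpendicular to BQ
    normal_QR = cross(Q, normal_AP)  # plane through Q perpendicular to AP
    return cross(normal_PR, E_Z), cross(normal_QR, E_Z)


def configuration_exists(a, b, tol=1e-12):
    """True when both conditions single out the same point R on line AB."""
    r1, r2 = candidates_for_R(a, b)
    return all(abs(t) < tol for t in cross(r1, r2))


if __name__ == "__main__":
    r1, _ = candidates_for_R(2.0, 0.5)
    print(configuration_exists(2.0, 0.5), elliptic_distance(A, r1), math.pi / 4)
```

With sample values such as a = 2 and b = 1/2 (so that ab = 1), the two candidate directions agree and the computed distance AR agrees with $\pi/4$ up to rounding; for values with $ab \neq \pm 1$ the two directions differ, so no admissible R exists.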

THEOREM 3.2. The binary relation $AB = \pi/2$ is a suitable primitive
notion for elliptic geometry.

PROOF. We can define the distance $\pi/2$ in terms of Pieri's relation, by
using the definition of bet(A, B, C) from Section 2, and the formula

$$AB = \pi/2 \leftrightarrow A \neq B \wedge \neg(\bigvee X)\,\mathrm{bet}(A, X, B).$$

It remains to define Pieri's relation in terms of the distance $\pi/2$.

Consider first elliptic plane geometry. Notice that the distance $\pi/4$ is
definable. Indeed, we see that $AR = \pi/4$ if and only if there exist points
B, C, P, Q satisfying the conditions stated in Theorem 3.1.

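We now give a series of further definitions leading to Pieri's relation
$AB = BC$:

$$\mathrm{mid}(A, B, C) \leftrightarrow \mathrm{col}(A, B, C) \wedge (\bigvee X)[AX = CX = \pi/4 \wedge AC \perp BX],$$

$$\mathrm{mex}(A, B, C) \leftrightarrow \mathrm{col}(A, B, C) \wedge (\bigvee X)[\mathrm{mid}(A, X, C) \wedge BX = \pi/2],$$

$$\mathrm{sym}(A, B, C) \leftrightarrow \mathrm{mid}(A, B, C) \vee \mathrm{mex}(A, B, C) \vee A = B = C \vee (A = C \wedge AB = \pi/2) \vee (AC = \pi/2 \wedge AB = BC = \pi/4),$$

$$AB = BC \leftrightarrow (\bigvee X)[\mathrm{sym}(A, X, C) \wedge BX = \pi/2].$$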

Here the conditions mid(A, B, C) and mex(A, B, C) require that $A \neq C$
and $AC \neq \pi/2$, and that B is the midpoint of the shorter or longer line
segment joining A and C (the "internal" or "external" midpoint). The
definition of sym(A, B, C) then gives a complete listing of the cases in
which A and C are symmetric with respect to B. Notice that, in the
definition of $AB = BC$, if $A \neq C$, then there are just two possible values
of X, the midpoints of the two segments joining A and C, and the polar of
either is perpendicular to AC at the other. If $A = C$, then X may be A or
any point on the polar of A, and B is completely arbitrary, as it should be.
The restriction to plane geometry may be removed by noticing that it
is possible to define the concept of a plane in $p$-space in terms of the
distance $\pi/2$. We can then define the relation $AB = BC$ by applying the
previous method in a plane containing A, B, and C.

THEOREM 3.3. In elliptic geometry, equidistance is definable in terms
of the distance $d$, for any $d$ with $0 < d \leq \pi/2$.

This can be derived from Theorem 3.2, by defining the distance $\pi/2$ in
terms of the distance $d$, but the details of the proof will be omitted.
Combining this result with Theorem 7.3, we see that the binary relation
$AB = d$ is a suitable primitive notion for elliptic geometry if and only if
$0 < d \leq \pi/2$ and $\cos d$ is algebraic.

4. Patch-wise congruence. Let any Euclidean or hyperbolic space be
given. Then we put

$$\mathrm{con}(X_1, X_2, \ldots, X_m; X_1', X_2', \ldots, X_m') \leftrightarrow X_1X_2 = X_1'X_2' \wedge X_1X_3 = X_1'X_3' \wedge \cdots \wedge X_{m-1}X_m = X_{m-1}'X_m'.$$

That is, two finite sequences of points are called congruent if all the
corresponding distances are equal. The space has a certain property of
homogeneity expressed by the condition

$$\mathrm{con}(X_1, \ldots, X_m; X_1', \ldots, X_m') \rightarrow (\bigwedge Y)(\bigvee Y')\,\mathrm{con}(X_1, \ldots, X_m, Y; X_1', \ldots, X_m', Y'),$$

which holds for all values of $m$. The only other fact about the given space
that we use in this section is that the space is unbounded.

The concept of congruence will now be extended to that of patch-wise
congruence. If $c > 0$, the formula

$$\mathrm{pat}(c: X_1, X_2, \ldots, X_m; X_1', X_2', \ldots, X_m')$$

will be used to denote that the two sequences $X_1, X_2, \ldots, X_m$ and
$X_1', X_2', \ldots, X_m'$ are patch-wise congruent, with separation constant
$c$. This formula is defined as follows. We start by considering any partition
of the indices $1, 2, \ldots, m$. For this partition, we form the conjunction
of all the formulas $X_iX_j = X_i'X_j'$ for $i$ and $j$ in the same class, and
of all the formulas $X_iX_j > c$ and $X_i'X_j' > c$ for $i$ and $j$ in different
classes. The disjunction of all these conjunctions, formed for all possible
partitions, is the required formula expressing patch-wise congruence.
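
The definition just given is a finite search over partitions, so it can be written out directly. The following sketch is illustrative only; it treats the Euclidean case alone, represents points as tuples of integers so that squared distances compare exactly, and the names `squared_distance`, `set_partitions`, and `pat` are hypothetical helpers chosen here.

```python
from itertools import combinations


def squared_distance(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + partition


def pat(c, xs, ys):
    """Patch-wise congruence of xs and ys with separation constant c > 0."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    indices = list(range(len(xs)))
    c_squared = c * c
    for partition in set_partitions(indices):
        block_of = {i: k for k, block in enumerate(partition) for i in block}

        def pair_ok(i, j):
            dx = squared_distance(xs[i], xs[j])
            dy = squared_distance(ys[i], ys[j])
            if block_of[i] == block_of[j]:
                return dx == dy                        # same class: equal distances
            return dx > c_squared and dy > c_squared   # different classes: far apart

        if all(pair_ok(i, j) for i, j in combinations(indices, 2)):
            return True
    return False
```

For example, two unit segments placed far apart are patch-wise congruent to two other unit segments placed far apart in a different relative position, provided the separation constant is small compared with the gaps, even though the two four-point sequences are not congruent as wholes.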


The formula $\mathrm{pat}(c: X_1, \ldots, X_m; X_1', \ldots, X_m')$ constructed in
this way actually expresses that the two sequences of points can be divided
into patches which are respectively congruent, such that the distance
between any two patches is greater than $c$. We shall now show that the
formula expressing the property of homogeneity mentioned above may be
extended to patch-wise congruence.

THEOREM 4.1. In any Euclidean or hyperbolic space, and for any
$c > 0$, we have

$$\mathrm{pat}(2c: X_1, \ldots, X_m; X_1', \ldots, X_m') \rightarrow (\bigwedge Y)(\bigvee Y')\,\mathrm{pat}(c: X_1, \ldots, X_m, Y; X_1', \ldots, X_m', Y').$$

PROOF. Pick out one disjunct of the hypothesis which is valid. This
determines which points are to be considered as belonging to the same
patch. Now if $Y$ is at a distance greater than $c$ from all the points $X_i$,
then it may be considered as forming a new patch, and we may choose for
$Y'$ any point at a distance greater than $c$ from all the points $X_i'$.
Otherwise, there is a unique patch, among the points $X_1, \ldots, X_m$, such
that $Y$ is at a distance at most $c$ from some point of the patch. Choose $Y'$
in a corresponding position relative to the corresponding patch of the points
$X_i'$.

THEOREM 4.2. Let $d$ be a positive number. Let a Euclidean or hyperbolic
space be given. Let $\alpha_1, \alpha_2, \dots, \alpha_q$ be binary relations on this space, such that
for $k = 1, 2, \dots, q$, we have

$$XY = X'Y' \rightarrow [\alpha_k(X, Y) \leftrightarrow \alpha_k(X', Y')]$$

and

$$\alpha_k(X, Y) \rightarrow XY \le d.$$

Let $\varphi$ be a formula with free variables $X_1, \dots, X_m$, which is elementary in
terms of $\alpha_1, \dots, \alpha_q$; that is, the atomic formulas of $\varphi$ have the form $X = Y$
or $\alpha_k(X, Y)$. Then

$$\mathrm{pat}(2^n d: X_1, \dots, X_m; X_1', \dots, X_m') \rightarrow [\varphi(X_1, \dots, X_m) \leftrightarrow \varphi(X_1', \dots, X_m')],$$

provided that $\varphi$ does not contain more than $n$ nested quantifiers. 5


5 It can be shown that $2^n d$ is the smallest possible separation constant which can
be used here.



PROOF. By induction in $n$. The result is clear for $n = 0$, since on the
basis of the hypothesis about patch-wise congruence, we have, for any
possible $i$ and $j$, either $X_iX_j = X_i'X_j'$, or else both $X_iX_j > d$ and
$X_i'X_j' > d$. Hence

$$X_i = X_j \leftrightarrow X_i' = X_j', \qquad \alpha_k(X_i, X_j) \leftrightarrow \alpha_k(X_i', X_j')$$

for all values of $k$, and the conclusion follows.

We now assume the theorem for some value of $n$, and prove it for $n + 1$.
It will be sufficient to consider the case in which

$$\varphi(X_1, \dots, X_m) \leftrightarrow (\bigvee Y)\, \psi(X_1, \dots, X_m, Y),$$

where $\psi$ is an elementary formula with the indicated free variables and
containing at most $n$ nested quantifiers. (For the truth-value of any
admissible formula $\varphi$ can be determined from the truth-values of formulas
of this form.) Now by the inductive hypothesis, we have

$$\mathrm{pat}(2^n d: X_1, \dots, X_m, Y; X_1', \dots, X_m', Y') \rightarrow [\psi(X_1, \dots, X_m, Y) \leftrightarrow \psi(X_1', \dots, X_m', Y')],$$

and according to Theorem 4.1,

$$\mathrm{pat}(2^{n+1} d: X_1, \dots, X_m; X_1', \dots, X_m') \rightarrow (\bigwedge Y)(\bigvee Y')\, \mathrm{pat}(2^n d: X_1, \dots, X_m, Y; X_1', \dots, X_m', Y').$$

Combining these results, we see that

$$\mathrm{pat}(2^{n+1} d: X_1, \dots, X_m; X_1', \dots, X_m') \rightarrow [(\bigvee Y)\, \psi(X_1, \dots, X_m, Y) \leftrightarrow (\bigvee Y')\, \psi(X_1', \dots, X_m', Y')].$$


THEOREM 4.3. Under the same hypotheses on $\alpha_1, \alpha_2, \dots, \alpha_q$, equidistance
is not definable in terms of them.


PROOF. Suppose the relation $X_1X_2 = X_3X_4$ were definable in terms
of $\alpha_1, \dots, \alpha_q$, using a formula containing at most $n$ nested quantifiers.
Then, by Theorem 4.2, we would have

$$\mathrm{pat}(2^n d: X_1, X_2, X_3, X_4; X_1', X_2', X_3', X_4') \rightarrow [X_1X_2 = X_3X_4 \leftrightarrow X_1'X_2' = X_3'X_4'].$$

This is certainly false, since the hypothesis holds whenever we have
$X_iX_j > 2^n d$ and $X_i'X_j' > 2^n d$ for all distinct $i$ and $j$.



THEOREM 4.4. In Euclidean or hyperbolic geometry, equidistance is not
definable in terms of any number of particular distances.

PROOF. For $k = 1, 2, \dots, q$, let $\alpha_k(X, Y) \leftrightarrow XY = d_k$, where $d_k > 0$.
Then $\alpha_1, \alpha_2, \dots, \alpha_q$ satisfy the hypotheses of Theorem 4.2, if we take
$d = \max(d_1, d_2, \dots, d_q)$. Now apply Theorem 4.3.

5. No binary primitives for Euclidean or hyperbolic geometry. We now
come to the question whether there are any binary relations $\alpha_1(X, Y), \dots,
\alpha_q(X, Y)$ which are suitable primitive notions for Euclidean geometry
(with a unit distance) or for hyperbolic geometry. To be suitable, they
should be definable in terms of equidistance and the unit distance in the
Euclidean case, and in terms of equidistance alone in the hyperbolic
case, and conversely. To show that this is impossible, we start by studying
binary relations which are definable in terms of equidistance and $r$
particular distances $d_1, d_2, \dots, d_r$. (Actually, we need only $r = 1$, $d_1 = 1$
in the Euclidean case, and $r = 0$ in the hyperbolic case.) In the hyperbolic
case, a preliminary theorem is needed.

THEOREM 5.1. In hyperbolic $p$-space, it is possible to introduce
coordinates $(x_1, x_2, \dots, x_p)$, with $x_1^2 + x_2^2 + \dots + x_p^2 < 1$, so that $e^{AB}$
can be calculated from the coordinates of $A$ and $B$ using rational operations
and the extraction of square roots.

PROOF. In the interior of the unit sphere in Euclidean $p$-space
introduce a new metric $[A, B]$ by putting $[A, B] = 0$ if $A = B$, and

$$[A, B] = \left| \log \frac{AR \cdot BS}{BR \cdot AS} \right|$$

otherwise, where $R$ and $S$ are the two points where the line joining $A$ and
$B$ intersects the unit sphere. From the coordinates of $A$ and $B$, we can
calculate successively the coordinates of $R$ and $S$, the distances $AR$, $BS$,
$BR$, $AS$, and finally $e^{[A, B]}$, using only rational operations and the
extraction of square roots. Now it is known that, with the metric just introduced,
the interior of the unit sphere in Euclidean $p$-space becomes a model for
hyperbolic $p$-space. (See, for example, Hilbert and Cohn-Vossen [2], § 35.)
Thus the theorem restates, from a different viewpoint, what we have just
proved.
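The successive calculation described in the proof can be sketched numerically. The code below is only an illustration under stated assumptions: points are coordinate tuples strictly inside the unit ball, $R$ is taken to be the intersection point beyond $B$ and $S$ the one beyond $A$ (so that the quotient is at least 1), and the function name `exp_klein_distance` is hypothetical. Apart from additions, multiplications and divisions, the only operation used is the square root.

```python
import math

def exp_klein_distance(a, b):
    """Return (AR * BS) / (BR * AS) for distinct points a, b inside the unit ball."""
    a, b = tuple(a), tuple(b)
    d = tuple(bi - ai for ai, bi in zip(a, b))        # direction B - A
    qa = sum(x * x for x in d)                        # |B - A|^2
    qb = 2 * sum(ai * di for ai, di in zip(a, d))     # 2 A.(B - A)
    qc = sum(ai * ai for ai in a) - 1                 # |A|^2 - 1, negative
    root = math.sqrt(qb * qb - 4 * qa * qc)
    s_far = (-qb + root) / (2 * qa)                   # parameter of R (beyond B)
    s_near = (-qb - root) / (2 * qa)                  # parameter of S (beyond A)
    r = tuple(ai + s_far * di for ai, di in zip(a, d))
    s = tuple(ai + s_near * di for ai, di in zip(a, d))
    ar, bs = math.dist(a, r), math.dist(b, s)
    br, as_ = math.dist(b, r), math.dist(a, s)
    return (ar * bs) / (br * as_)
```

Along a single line through the centre the quotient is multiplicative, which reflects the additivity of the distance $[A, B]$ on that line: with $A = (0, 0)$, $B = (0.5, 0)$ and $C = (0.8, 0)$ the three quotients are 3, 3 and 9.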

THEOREM 5.2. In a Euclidean or hyperbolic space, any binary relation
$\alpha(X, Y)$ which is definable in terms of equidistance and particular distances
$d_1, d_2, \dots, d_r$, satisfies the condition

$$XY = X'Y' \rightarrow [\alpha(X, Y) \leftrightarrow \alpha(X', Y')],$$

and, for some $d > 0$, one of the conditions

$$XY > d \rightarrow \alpha(X, Y), \qquad XY > d \rightarrow \neg\alpha(X, Y).$$

PROOF.  The  first  conclusion  is  clear,  since  there  is  an  isometric 
mapping  of  the  space  onto  itself  which  takes  X  into  X'  and  Y  into  Y'. 
This  mapping  preserves  the  equidistance  relation  and  the  particular 
distances,  and  hence  anything  definable  in  terms  of  them. 

We turn now to the second conclusion. Suppose that $t > 0$, and
consider the formula

$$(\bigwedge X, Y)[XY = t \rightarrow \alpha(X, Y)].$$

We can eliminate all point variables in favor of real variables, by
introducing coordinates. In the Euclidean case, by simply squaring all
equations that occur, we obtain an equivalent formula of elementary algebra,
containing only $t$ and $d_1, d_2, \dots, d_r$ as free variables. In the hyperbolic
case, we use Theorem 5.1; by a little manipulation, including the
elimination of square roots by introducing additional existential quantifiers,
we again obtain an equivalent formula of elementary algebra, where in
this case $e^t$ and $e^{d_1}, \dots, e^{d_r}$ play the role of free variables.

Following the procedure of Tarski [6], all bound variables can be
eliminated, if we allow the introduction of inequalities (Tarski's Theorem
31). If numerical values are assigned to $d_1, d_2, \dots, d_r$, we see that there is
a real number $d$ such that the resulting formula is either true for all $t > d$
or else false for all $t > d$. Thus the same alternatives hold for the displayed
formula with which we started. In the first case, we have $XY > d \rightarrow
\alpha(X, Y)$. In the second, taking account of the fact that the truth-value of
$\alpha(X, Y)$ depends only on $XY$, we see that $XY > d \rightarrow \neg\alpha(X, Y)$.

THEOREM 5.3. In a Euclidean or hyperbolic space, it is impossible to
find binary relations $\alpha_1, \alpha_2, \dots, \alpha_q$, which are definable in terms of
equidistance and particular distances, and in terms of which equidistance is
definable. Thus there are no binary relations which are suitable as the
primitive notions of Euclidean geometry (with a unit distance) or of hyperbolic
geometry.

PROOF. We may apply Theorem 5.2 to each of the relations $\alpha_k$. By
replacing $\alpha_k$ by $\neg\alpha_k$ if necessary, we may assume that we have
$XY > d \rightarrow \neg\alpha_k(X, Y)$, and hence $\alpha_k(X, Y) \rightarrow XY \le d$, for all values of $k$
(taking $d$ to be the largest of the bounds obtained for the individual
relations). The proof is completed by applying Theorem 4.3.

6.  Local  definability  of  equidistance.  Although,  as  shown  in  Section  4, 
equidistance  is  not  definable  in  terms  of  particular  distances  in  Euclidean 
or  hyperbolic  geometry,  nevertheless  equidistance  is  locally  definable  in 
terms  of  a  single  given  distance.  6  By  a  local  definition  of  equidistance 
AB  =  CD  (in  terms  of  a  given  distance  d)  is  meant  a  formula  which 
provides  a  necessary  and  sufficient  condition  for  this  equality,  on  the 
assumption  that  the  distance  between  each  two  of  the  four  points  does 
not  exceed  a  prescribed  bound.  We  shall  see  that  this  bound  can  be  taken 
arbitrarily  large,  although  the  formula  required  becomes  longer  as  the 
bound  increases.  (Throughout  this  section,  d  denotes  an  arbitrary 
positive  number.) 

THEOREM 6.1. In Euclidean or hyperbolic geometry:

(a) The distance $2d$ is definable in terms of the distance $d$.

(b) Any one of the relations $AB = d$, $AB \le d$, $AB < d$ is definable
in terms of any other one.

(c) The local symmetry relation $\mathrm{sym}(A, B, C) \wedge AB \le h$ is definable
in terms of the distance $d$ if and only if the distance $h$ is definable in terms of
the distance $d$.

PROOF. (a) We may use the formula 7

$$AB = 2d \leftrightarrow (\bigvee X)(\bigwedge Y)[AY = d \wedge BY = d \leftrightarrow Y = X],$$


6 At the time this paper was presented to the Symposium, I knew this result only
for Euclidean or hyperbolic geometry of three or more dimensions. A few days
afterwards, A. Seidenberg pointed out to me that the linkage of Peaucellier, which
enables one to draw a line segment in the Euclidean plane, furnishes a local
definition of collinearity in terms of a particular distance, and that this in turn leads to
a local definition of equidistance. Some time later, Seidenberg also succeeded in
extending the result to the hyperbolic plane. Subsequently, the author found a
different and simpler solution to this problem. The method used here can also be
adapted to the higher-dimensional hyperbolic spaces, and is presented below in this
extended form. The local definition of midpoint used in the proof of Theorem 6.1(c)
is a modified form of the definition suggested by Seidenberg for use in the
hyperbolic plane.

7 This definition of the distance $2d$ in terms of the distance $d$ uses an existential
and a universal quantifier. In Euclidean geometry, it is also possible to define the
distance $2d$ existentially in terms of the distance $d$, that is, by means of a formula in
prenex form containing only existential quantifiers. (In the two-dimensional case,



(b) Whichever of the three relations is given, we can easily define
$AB < 2d$, since this expresses that the spheres of radius $d$ about $A$ and $B$
overlap. If the given relation is $AB \le d$, then we may use the formula

$$AB < d \leftrightarrow (\bigwedge X)[AX \le d \rightarrow BX < 2d]$$

to define $AB < d$, and hence $AB = d$ can also be defined. A similar
argument applies in the other cases.

(c) We see that

$$0 < AB < 2d \rightarrow [\mathrm{sym}(A, B, C) \leftrightarrow B \ne C \wedge (\bigwedge X, Y)(AX = XB = BY = d \wedge XY = 2d \rightarrow CY = d)].$$

Indeed, the possible values of $X$ lie on the intersection of two spheres
$AX = d$ and $BX = d$, and, since $0 < AB < 2d$, these spheres actually
intersect. The possible values of $Y$ are those symmetric to $X$ with respect
to $B$. The only point $C$, other than $B$ itself, which is at a distance $d$ from
all such points $Y$ is the point symmetric to $A$ with respect to $B$. This
formula clearly leads to a suitable definition of the relation $\mathrm{sym}(A, B, C)
\wedge AB \le d$, and the stated result then follows easily.
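The idea behind part (a), that $AB = 2d$ holds exactly when a single point is at distance $d$ from both $A$ and $B$, can be illustrated in the Euclidean plane. This is a minimal sketch under assumptions not made in the text: it works only in two dimensions, it compares distances up to a floating-point tolerance, and the function names are introduced here for illustration.

```python
import math

def common_points_at_distance(a, b, d, tol=1e-9):
    """Points Y of the Euclidean plane with AY = d and BY = d (a != b)."""
    ab = math.dist(a, b)
    half = ab / 2
    if half > d + tol:
        return []                         # the circles do not meet
    mx, my = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2
    if abs(half - d) <= tol:
        return [(mx, my)]                 # tangent circles: AB = 2d
    h = math.sqrt(d * d - half * half)    # half-length of the common chord
    ux, uy = (b[0] - a[0]) / ab, (b[1] - a[1]) / ab
    return [(mx - h * uy, my + h * ux), (mx + h * uy, my - h * ux)]

def is_distance_2d(a, b, d):
    """Right-hand side of the formula in (a): exactly one such Y exists."""
    return len(common_points_at_distance(a, b, d)) == 1
```

In higher dimensions the two spheres meet in a lower-dimensional sphere rather than in two points, but the common set still reduces to a single point precisely in the tangent case.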

THEOREM 6.2. In Euclidean or hyperbolic geometry, the local Pieri
relation $AB = BC \le h$ can be defined in terms of the distance $d$ if and only
if the distance $h$ can be defined in terms of the distance $d$.

PROOF. The necessity of the condition follows from Theorem 6.1(b).
To prove the sufficiency, we need only show that the relation $AB = BC \le d$
is definable in terms of the distance $d$. The proof is divided into three
cases.

CASE 1. Euclidean $p$-space, $p \ge 3$. It is easily seen that

$$AB = BC \le 2d \leftrightarrow (\bigvee X, Y, Z)[AY = YC = 2d \wedge AX = XY = YZ = ZC = XB = BZ = d].$$

This is based on the idea of taking an isosceles triangle whose equal sides
are $2d$ and folding it along the line joining the midpoints of the two equal


use a network of equilateral triangles. In three dimensions, use twice the fact that
the diagonal of an octahedron is $2^{1/2}$ times an edge, and similarly in higher
dimensions.) Starting from this fact, it is possible to put the local definition of
equidistance in an existential form, and to define all algebraic distances existentially
in terms of the unit distance (thus sharpening Theorems 6.3 and 7.1). I do not see
any way of doing the corresponding things in the non-Euclidean cases.



sides. The vertex remains at an equal distance from the two ends of the
base, and this distance may be any amount not less than half the base and
not exceeding $2d$. By adjoining the condition $AB \le d$, we obtain the
required relation $AB = BC \le d$.

CASE 2. Euclidean plane. There is a well-known linkage, due to
Peaucellier, which can be used to draw a line segment. (See Kempe [3] or
Hilbert and Cohn-Vossen [2], § 40.) Choosing, for the lengths of all links,
distances definable in terms of $d$ (for example, suitable multiples of $d$),
and considering three positions of the linkage, we obtain a local definition
of collinearity in terms of the distance $d$. Combining that with the formula

$$AB = BC \wedge AC < 2d \leftrightarrow (\bigvee X, Y)[X \ne Y \wedge AX = CX = AY = CY = d \wedge \mathrm{col}(B, X, Y)],$$

we easily obtain the required result.

CASE 3. Hyperbolic $p$-space, $p \ge 2$. If $p \ge 3$, we could proceed much
as in Case 1. However, we shall apply a different method, which does not
exclude the case $p = 2$, but which definitely uses the non-Euclidean
character of the space. In fact, we see that

$$AC = 2d \rightarrow \{\mathrm{col}(A, B, C) \leftrightarrow (\bigvee X, Y)[\mathrm{sym}(A, X, B) \wedge \mathrm{sym}(B, Y, C) \wedge XY = d]\}.$$

We have expressed the similarity of the triangles $ABC$ and $XBY$, which
is impossible unless the triangles are degenerate. Indeed, in hyperbolic
geometry, the line joining the midpoints of two sides of a triangle is less
than half as long as the third side. Since $\mathrm{sym}(A, B, C)$ is locally definable,
we can obtain a local definition of $\mathrm{col}(A, B, C)$, at least under the
restriction that $AC = 2d$. From this, we can get a local definition of
$\mathrm{col}(B_1, B_2, B_3)$, without such a restriction, by considering three values
of $B$ with the same $A$ and $C$. We can then proceed to a local definition of
Pieri's relation as in Case 2.

THEOREM 6.3. In Euclidean or hyperbolic geometry, the local
equidistance relation


can  be  defined  in  terms  of  the  distance  d  if  and  only  if  the  distance  h  is 
definable  in  terms  of  the  distance  d. 



PROOF.  The  condition  is  clearly  necessary,  and  the  sufficiency  can  be 
derived  from  Theorem  6.2  by  a  suitable  modification  of  the  method  used 
in  Section  2. 

7. Definable distances. We shall now determine what distances $t$ are
definable in terms of a given distance $d$, with or without the use of
equidistance, or, in the non-Euclidean cases, in terms of equidistance alone.

We  start  by  giving  a  few  definitions  valid  in  both  the  Euclidean  and 
hyperbolic  geometries.  (With  some  modifications,  they  can  be  used  also 
in  the  elliptic  case.)  The  relation  of  equidistance  is  considered  as  given, 
and  notions  previously  defined  in  terms  of  equidistance  are  also  used.  In 
the  first  place,  we  have 

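$$AB = CD + EF \leftrightarrow (AB = CD \wedge E = F) \vee (AB = EF \wedge C = D) \vee (\bigvee X)[\mathrm{bet}(A, X, B) \wedge AX = CD \wedge XB = EF].$$

We also wish to define perpendicularity. A special case is covered by the
formula

$$AC \perp BC \leftrightarrow A \ne C \wedge B \ne C \wedge (\bigvee X)[\mathrm{sym}(B, C, X) \wedge AB = AX].$$

We can then proceed to the formula

$$AB \perp CD \leftrightarrow A \ne B \wedge C \ne D \wedge (\bigvee X, Y, Z)[XZ \perp YZ \wedge \mathrm{col}(A, X, Z) \wedge \mathrm{col}(B, X, Z) \wedge \mathrm{col}(C, Y, Z) \wedge \mathrm{col}(D, Y, Z)],$$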

which  defines  perpendicularity  in  general. 

THEOREM  7.1.  In  Euclidean  geometry,  the  distance  t  is  definable  in  terms 
of  equidistance  and  the  unit  distance  if  and  only  if  t  is  algebraic.  The  algebraic 
distances  are  indeed  definable  in  terms  of  the  unit  distance  alone. 

PROOF. Suppose that $(\bigwedge A, B)[AB = t \leftrightarrow \varphi(A, B)]$ is a valid formula
of $p$-dimensional Euclidean geometry, where $\varphi(A, B)$ is expressed in
terms of equidistance and the unit distance. By introducing coordinates,
it can be transformed into a formula of elementary algebra, with $t$ as its
only free variable. By Tarski [6], the bound variables may be eliminated,
which leads to the conclusion that $t$ must be algebraic.

It remains to show that all algebraic distances can be defined. We have
already defined $AB = CD + EF$, and we can define the product of two
distances by the formula

$$AB = CD \cdot EF \leftrightarrow (A = B \wedge C = D) \vee (\bigvee P, Q, R, S)[\mathrm{col}(P, A, B) \wedge \mathrm{col}(P, Q, R) \wedge P \ne A \wedge AQ \perp PS \wedge BR \perp PS \wedge PQ = 1 \wedge PA = CD \wedge QR = EF].$$

Using these definitions of sum and product of two distances, we can
express that a certain distance satisfies a given algebraic equation. By
the use of suitable inequalities, which are also definable, we can isolate
a particular root, and hence define that $AB = t$, where $t$ is a given
algebraic number.
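The product formula rests on the intercept theorem: with $PQ = 1$, $PA = CD$, $QR = EF$ and $AQ$ parallel to $BR$, one obtains $AB = CD \cdot EF$. The sketch below checks this numerically in the Euclidean plane. It is an illustration under assumptions of its own: $P$ is placed at the origin, $Q$ and $R$ on the first axis, the angle between the two rays is a free parameter (not a multiple of $\pi$), and parallelism is imposed directly rather than through a common perpendicular $PS$. The function name is hypothetical.

```python
import math

def product_by_parallels(cd, ef, angle=math.pi / 3):
    """Length AB in the configuration PQ = 1, PA = cd, QR = ef, AQ parallel to BR."""
    q = (1.0, 0.0)                                   # P is the origin, PQ = 1
    r = (1.0 + ef, 0.0)                              # QR = ef, on the line PQ
    a = (cd * math.cos(angle), cd * math.sin(angle)) # PA = cd
    # B lies on line PA (B = t*A) and on the line through R parallel to AQ:
    #   t*A = R + u*(A - Q).  Solve the 2x2 system for t by Cramer's rule.
    dx, dy = a[0] - q[0], a[1] - q[1]
    det = a[0] * (-dy) - (-dx) * a[1]
    t = (r[0] * (-dy) - (-dx) * r[1]) / det
    b = (t * a[0], t * a[1])
    return math.dist(a, b)
```

For example, `product_by_parallels(2, 3)` returns 6 up to rounding, independently of the angle chosen.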

If we are given only the unit distance, but not equidistance, then
equidistance is nevertheless locally definable. All of the concepts used can
be defined locally, which is sufficient for the purposes of the proof.
(Notice that in the definition of the product of two distances above, we
expressed the parallelism of the lines $AQ$ and $BR$ by the existence of a
common perpendicular, and not by the non-existence of a point of
intersection, so that this transition would be possible.)

THEOREM 7.2. In hyperbolic geometry, the distance $t$ is definable in
terms of equidistance if and only if $e^t$ is algebraic.

PROOF.  Using  Theorem  5.1  and  Tarski  [6],  we  see  that  only  such 
distances  can  be  definable  in  terms  of  equidistance.  It  remains  to  show 
that  all  such  distances  are  definable. 

We have defined the relation $AB = CD + EF$, but the definition of
$AB = CD \cdot EF$ does not apply here. Indeed, this product formula is not
definable, since if it were, we could define the unit distance $AB = 1$,
which is impossible since $e$ is not algebraic. But we shall show that it is
possible to define the two formulas

$$\cosh AB = \cosh CD + \cosh EF, \qquad \cosh AB = \cosh CD \cdot \cosh EF.$$

We will then be able to express the condition that $\cosh AB$ satisfies a
given algebraic equation, and hence the condition that it is a given
algebraic number. Thus a distance $t$ will be definable if $\cosh t$ is algebraic,
or, what is equivalent, if $e^t$ is algebraic.

The  definition  of  the  second  formula  follows  at  once  from  the  known 
formula  cosh  c  =  cosh  a  cosh  b  connecting  the  sides  of  a  right  triangle. 
Thus  we  have 

$$\cosh AB = \cosh CD \cdot \cosh EF \leftrightarrow (AB = CD \wedge E = F) \vee (AB = EF \wedge C = D) \vee (\bigvee X)[AX \perp BX \wedge AX = CD \wedge BX = EF].$$



Also, since $2 \cosh x \cosh y = \cosh(x + y) + \cosh(x - y)$, we see that

$$2 \cosh AB = \cosh CD + \cosh EF \wedge CD \ge EF \leftrightarrow (\bigvee P, Q, R, S)[\cosh AB = \cosh PQ \cdot \cosh RS \wedge CD = PQ + RS \wedge EF + RS = PQ],$$

which leads to a definition of $2 \cosh AB = \cosh CD + \cosh EF$. The
factor 2 on the left could be removed, if we were able to define the
relation $\cosh XY = 2$. This can be done, for example, by a judicious
combination of the above formulas. Indeed, we see that

$$\cosh AB = 2 \leftrightarrow A \ne B \wedge (\bigvee P, Q, R, S)[\cosh PQ = \cosh^2 AB \wedge 2 \cosh AB = \cosh RS + 1 \wedge 2 \cosh RS = \cosh AB + \cosh PQ].$$

Since $\cosh XX = 1$, we see that all the equations on the right are special
cases of the formulas which we have defined, so that this furnishes the
desired definition.
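The arithmetic behind the definition of $\cosh AB = 2$ can be verified directly. Writing $c = \cosh AB$, the three equations read here as $\cosh PQ = c^2$, $2c = \cosh RS + 1$ and $2 \cosh RS = c + \cosh PQ$ give $c^2 - 3c + 2 = 0$, so $c = 1$ or $c = 2$, and $A \ne B$ excludes $c = 1$. The sketch below only checks this computation, together with the identity for $2 \cosh x \cosh y$ and the relation $e^t = \cosh t + \sqrt{\cosh^2 t - 1}$ that makes algebraicity of $\cosh t$ and of $e^t$ equivalent; the function names are introduced for illustration.

```python
import math

def solutions_for_cosh_ab():
    """Roots of c^2 - 3c + 2 = 0, the equation obtained by eliminating PQ and RS."""
    disc = 3 * 3 - 4 * 1 * 2
    return sorted([(3 - math.sqrt(disc)) / 2, (3 + math.sqrt(disc)) / 2])

def satisfies_system(c, tol=1e-12):
    """Check the three equations for a candidate value c of cosh AB."""
    cosh_pq = c * c            # cosh PQ = cosh^2 AB
    cosh_rs = 2 * c - 1        # 2 cosh AB = cosh RS + 1
    return abs(2 * cosh_rs - (c + cosh_pq)) <= tol   # 2 cosh RS = cosh AB + cosh PQ

def exp_from_cosh(c):
    """e^t recovered from c = cosh t (t >= 0) by one square root."""
    return c + math.sqrt(c * c - 1)

def product_to_sum_gap(x, y):
    """Difference between the two sides of 2 cosh x cosh y = cosh(x+y) + cosh(x-y)."""
    return 2 * math.cosh(x) * math.cosh(y) - (math.cosh(x + y) + math.cosh(x - y))
```

In particular, the distance $t$ with $\cosh t = 2$ has $e^t = 2 + \sqrt{3}$, which is algebraic, in agreement with Theorem 7.2.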

The  proofs  of  the  last  two  theorems  will  be  omitted,  since  they  do  not 
require  any  essentially  new  methods. 

THEOREM  7.3.  In  elliptic  geometry,  the  distance  t  is  definable  in  terms 
of  equidistance  if  and  only  if  cos  t  is  algebraic. 

THEOREM 7.4. The distance $t$ is definable in terms of the distance $d$
(where $d > 0$, and in the elliptic case also $d \ne \pi/2$) if and only if the stated
condition is satisfied.

(a) Euclidean case: $t/d$ is algebraic.

(b) Hyperbolic case: $e^t$ is algebraic in terms of $e^d$.

(c) Elliptic case: $\cos t$ is algebraic in terms of $\cos d$.

These  results  are  unchanged  if  the  relation  of  equidistance  is  also  considered 
as  given. 


Bibliography

[1] BETH, E. W. and A. TARSKI, Equilaterality as the only primitive notion of
Euclidean geometry. Indagationes Mathematicae, vol. 18 (1956), pp. 462-467.

[2] HILBERT, D. and S. COHN-VOSSEN, Anschauliche Geometrie. Berlin 1932,
viii + 310 pp. [English translation: Geometry and the imagination. New York
1956, ix + 357 pp.]

[3] KEMPE, A. B., How to draw a straight line; a lecture on linkages. London 1877,
vi + 51 pp.

[4] PIERI, M., La geometria elementare istituita sulle nozioni di 'punto' e 'sfera'.
Memorie di Matematica e di Fisica della Societa Italiana delle Scienze, ser. 3,
vol. 15 (1908), pp. 345-450.

[5] ROYDEN, H. L., Remarks on primitive notions for elementary Euclidean and non-
Euclidean plane geometry. This volume, pp. 86-96.

[6] TARSKI, A., A decision method for elementary algebra and geometry. Second
edition, Berkeley and Los Angeles 1951, iv + 63 pp.

[7] TARSKI, A., What is elementary geometry? This volume, pp. 16-29.


Symposium  on  the  Axiomatic  Method 


REMARKS  ON  PRIMITIVE  NOTIONS  FOR  ELEMENTARY 
EUCLIDEAN  AND  NON-EUCLIDEAN  PLANE  GEOMETRY 

H.  L.  ROYDEN 

Stanford   University,   Stanford,   California,    U.S.A. 

Introduction. The purpose of the present paper is to explore some
relationships between primitive notions in elementary plane geometry
with a view to determining the possibility of defining certain notions in
terms of others. All of our primitive notions are predicates whose
arguments are the primitive elements (points or points and lines) and we say
that a primitive $F$ can be defined in terms of a primitive $G$ relative to a
deductive system $S$ if

$$(x, y, z, \dots)[F(x, y, z, \dots) \Leftrightarrow \Phi(x, y, z, \dots)]$$

is a theorem in $S$ where $\Phi$ is a sentential function involving only $G$ and
logical terms in its formation (cf. [10]).

Whether $F$ is definable in terms of $G$ depends not only on the deductive
system $S$, but also on the logical basis used and our results are sometimes
different if we use only the restricted predicate calculus rather than a
logic which contains the theory of sets. Definitions using only the
restricted predicate calculus will be called elementary and the others
set-theoretic. In the present paper all of our definitions are elementary except
for part of Section 5 where there is some discussion of the possibility of
definitions using variables ranging over finite sets of points.

We consider here both Euclidean and non-Euclidean geometry and
use a set of axioms equivalent to Hilbert's without the axioms of
completeness and of Archimedes. We shall sometimes supplement these with
an axiom (P12) to the effect that any line through a point inside a circle
has a point in common with the circle. One of my purposes here is to show
the role played by this axiom in the definability of concepts in elementary
geometry.

Theorem  1  shows  that  for  Euclidean  and  elliptic  geometry  this  axiom 
plays  an  essential  role  in  the  possibility  of  defining  order  in  terms  of 
collinearity.  With  regard  to  hyperbolic  geometry  the  situation  is  markedly 
different  and  order  can  be  defined  in  terms  of  collinearity  independently 


of  this  axiom.  As  Menger  [3,  4]  has  pointed  out,  the  whole  of  hyperbolic 
geometry  can  be  built  on  the  notion  of  collinearity.  We  use  here  the 
elegant  definition  of  order  given  by  Jenks  [2],  but  our  treatment  of  the 
definition  of  congruence  differs  somewhat  from  that  of  Menger  and  his 
students  in  that  we  first  define  orthogonality  and  use  it  in  the  definition 
of  congruence. 

1. The basic elementary geometries. Euclidean geometry. We shall
consider two systems for elementary Euclidean plane geometry. The first
is the system $\mathcal{P}$ which uses the undefined primitives $\beta$ and $\delta$ and consists
of all consequences of the axioms P1-P11 listed below. Intuitively,
$\beta(xyz)$ has the meaning "x, y, and z are collinear and y is between x and z,"
while $\delta(xyzw)$ has the meaning "the segment xy is congruent to the segment
zw." In terms of these notions we define the notion of collinearity:

$$\lambda(xyz) =_{df} \beta(xyz) \vee \beta(yzx) \vee \beta(zxy);$$

and parallelism:

$$\pi(xyuv) =_{df} (x \ne y)\ \&\ (u \ne v)\ \&\ \sim(\exists z)[\lambda(xyz)\ \&\ \lambda(uvz)].$$

Thus $\pi(xyuv)$ states that $(x, y)$ and $(u, v)$ are pairs of distinct points lying
on distinct parallel lines. Our axiom system for $\mathcal{P}$ corresponds to Hilbert's
axiom system, with the exclusion of the axioms of Archimedes and of
completeness, and is equivalent to the Axioms A1-A12 of Tarski [12]. In
fact, our axioms are taken directly from Tarski's paper, except that our
version P7 of Pasch's axiom is stronger than Tarski's A7 and together
with the remaining axioms it implies Tarski's A12, which is accordingly
omitted from our list.

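AXIOMS FOR $\mathcal{P}$

P1 $(x)(y)[\beta(xyx) \Rightarrow x = y]$

P2 $(x)(y)(z)(u)[\beta(xyu)\ \&\ \beta(yzu) \Rightarrow \beta(xyz)]$

P3 $(x)(y)(z)(u)[\beta(xyz)\ \&\ \beta(xyu)\ \&\ (x \ne y) \Rightarrow \beta(xzu) \vee \beta(xuz)]$

P4 $(x)(y)\,\delta(xyyx)$

P5 $(x)(y)(z)[\delta(xyzz) \Rightarrow (x = y)]$

P6 $(x)(y)(z)(u)(v)(w)[\delta(xyzu)\ \&\ \delta(xyvw) \Rightarrow \delta(zuvw)]$

P7 $(t)(x)(y)(z)(u)(\exists v)[\beta(ztu) \Rightarrow \lambda(ytv)\ \&\ \{\beta(zvx) \vee \beta(uvx)\}]$

P8 $(t)(x)(y)(z)(u)(\exists v)(\exists w)[\beta(xut)\ \&\ \beta(yuz)\ \&\ (x \ne u) \Rightarrow \beta(xzv)\ \&\ \beta(xyw)\ \&\ \beta(vtw)]$

P9 $(x)(y)(z)(u)(x')(y')(z')(u')[\delta(xyx'y')\ \&\ \delta(yzy'z')\ \&\ \delta(xux'u')\ \&\ \delta(yuy'u')\ \&\ \beta(xyz)\ \&\ \beta(x'y'z')\ \&\ (x \ne y) \Rightarrow \delta(zuz'u')]$

P10 $(x)(y)(u)(v)(\exists z)[\beta(xyz)\ \&\ \delta(yzuv)]$

P11 $(\exists x)(\exists y)(\exists z)[\sim\lambda(xyz)]$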

It should be noted that in the presence of the other axioms, P8 is
equivalent to the following axiom:

P8' $(x)(y)(z)(u)(v)[\pi(xyzu)\ \&\ \pi(xyzv) \Rightarrow \lambda(zuv)]$

The existence axioms in $\mathcal{P}$ guarantee the existence of those points which
are the intersections of lines and those that can be constructed by the
use of a "transferer of segments" (P10). If we wish to have all points which
can be obtained by the use of compasses, we must add the following
axiom:

P12 $(x)(y)(z)(x')(z')(u)(\exists y')[\delta(uxux')\ \&\ \delta(uzuz')\ \&\ \beta(uxy)\ \&\ \beta(xyz) \Rightarrow \delta(uyuy')\ \&\ \beta(x'y'z')]$

This is precisely Tarski's axiom A13', and the geometry having P1-P12 as
axioms will be referred to as $\mathcal{P}^*$. It is equivalent to Tarski's system $\mathcal{E}_2''$.
In the presence of the remaining axioms the axiom P12 is equivalent to
the axiom P12' which is stated entirely in terms of the notion $\beta$ and its
derived notions $\lambda$ and $\pi$:

P12' $(x)(y)(z)(\exists u)(\exists v)(\exists w)[\beta(xyz)\ \&\ (x \ne y)\ \&\ (y \ne z) \Rightarrow \lambda(xyw)\ \&\ \lambda(xuv)\ \&\ \pi(uyvw)\ \&\ \pi(uwvz)]$

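If $\mathfrak{F}$ is an ordered field, we define the (two-dimensional) coordinate
geometry $\mathfrak{C}(\mathfrak{F})$ as the set of all ordered pairs $x = (x_1, x_2)$ of elements of $\mathfrak{F}$
with the notions $\beta$ and $\delta$ defined as follows:

$$\beta(xyz) =_{df} [(x_1 - y_1)(y_2 - z_2) = (x_2 - y_2)(y_1 - z_1)]\ \&\ 0 \le (x_1 - y_1)(y_1 - z_1)\ \&\ 0 \le (x_2 - y_2)(y_2 - z_2),$$

$$\delta(xyzu) =_{df} [(x_1 - y_1)^2 + (x_2 - y_2)^2 = (z_1 - u_1)^2 + (z_2 - u_2)^2].$$

If $\mathfrak{F}$ has the property that the sum of two squares is a square, we call $\mathfrak{F}$
a Pythagorean field. If $\mathfrak{F}$ is a Pythagorean field, then $\mathfrak{C}(\mathfrak{F})$ is a model for $\mathcal{P}$.
Conversely, any model for $\mathcal{P}$ is isomorphic to $\mathfrak{C}(\mathfrak{F})$ for some Pythagorean
field $\mathfrak{F}$. The models for $\mathcal{P}^*$ are isomorphic to the geometries $\mathfrak{C}(\mathfrak{F})$ where $\mathfrak{F}$
is Euclidean, i.e. has the property that every positive element is a square.
Conversely, each such geometry is a model for $\mathcal{P}^*$.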
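The coordinate definitions of betweenness and congruence can be tried out on concrete points. The following is a minimal sketch which assumes the usual coordinate conditions (proportionality of coordinate differences together with two sign conditions for betweenness, equality of squared lengths for congruence) and which uses the rational numbers as the ordered field. The rationals are not Pythagorean, so this only illustrates the two definitions and does not give a model of all the axioms; the function names are introduced here for illustration.

```python
from fractions import Fraction as F

def beta(x, y, z):
    """Betweenness: y lies on the segment from x to z (endpoints allowed)."""
    (x1, x2), (y1, y2), (z1, z2) = x, y, z
    collinear = (x1 - y1) * (y2 - z2) == (x2 - y2) * (y1 - z1)
    return (collinear
            and (x1 - y1) * (y1 - z1) >= 0
            and (x2 - y2) * (y2 - z2) >= 0)

def delta(x, y, z, u):
    """Congruence: segment xy has the same squared length as segment zu."""
    def sq(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return sq(x, y) == sq(z, u)

def lam(x, y, z):
    """Collinearity, defined from betweenness as in the text."""
    return beta(x, y, z) or beta(y, z, x) or beta(z, x, y)

# Three points of the rational plane used in the checks below.
o, a, b = (F(0), F(0)), (F(1), F(1)), (F(2), F(2))
```

With these points, `beta(o, a, b)` holds while `beta(o, b, a)` fails, and `lam` holds for the three points in any order. Because all arithmetic is exact, no tolerance is needed.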

Elliptic geometry. One can give a similar set of axioms for elliptic plane
geometry except that order is now expressed by means of a four-place
relation $\gamma(xyzw)$ with the meaning that $x$, $y$, $z$, and $w$ are collinear and the
pair $(x, y)$ does not separate the pair $(z, w)$. Again we get two systems, $\mathcal{E}$
and $\mathcal{E}^*$, depending on whether or not we include the axiom corresponding
to P12. This axiom is the following:

E12 $(x)(y)(z)(w)\{\gamma(xyzw) \Leftrightarrow (\exists r)(\exists s)(\exists t)(\exists u)(\exists v)[\lambda(xyz)\ \&\ \lambda(yzw)\ \&\ \lambda(xyt)\ \&\ \lambda(xuv)\ \&\ \lambda(wrs)\ \&\ \lambda(uyr)\ \&\ \lambda(uts)\ \&\ \lambda(vtr)\ \&\ \lambda(vzs)]\}$.


Let $\mathfrak{F}$ be a Pythagorean field. Then by the elliptic geometry $\mathfrak{E}(\mathfrak{F})$ we
mean the set of ordered triples $x = (x_1, x_2, x_3) \ne (0, 0, 0)$ from $\mathfrak{F}$, where
$(ax_1, ax_2, ax_3)$ is taken to be equivalent to $(x_1, x_2, x_3)$ for $a \ne 0$. We
define $\lambda(x, y, z)$ to mean that the triples $x$, $y$, and $z$ are linearly dependent, and
$\delta(xyzw)$ to mean that


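The notion $\gamma$ can then be defined in terms of the order in $\mathfrak{F}$ so that $\mathfrak{E}(\mathfrak{F})$
becomes a model for $\mathcal{E}$ and all models of $\mathcal{E}$ are isomorphic to $\mathfrak{E}(\mathfrak{F})$ for
some $\mathfrak{F}$. The geometry $\mathfrak{E}(\mathfrak{F})$ is a model for $\mathcal{E}^*$ if and only if $\mathfrak{F}$ is Euclidean.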


In elliptic geometry we can introduce the binary relation $\alpha(xy)$ of polarity
between points, which indicates that one point lies on the polar of
the other. We can define collinearity in terms of $\alpha$ by the following
equivalence:

$\lambda(xyz) \leftrightarrow (\exists t)[\alpha(tx) \,\&\, \alpha(ty) \,\&\, \alpha(tz)]$.


Hyperbolic geometry. By the elementary hyperbolic geometry $\mathcal{H}$ we
mean the geometry which follows from axioms P1-7 and P9-11 together
with the negation of P8. If we assume also P12, then we call the geometry
$\mathcal{H}^*$.

It should be remarked that the notion $\pi$ which we defined for $\mathcal{P}$ and
$\mathcal{P}^*$ here means non-intersection rather than parallelism. Parallelism will



be denoted by $\pi'$ and is defined as follows:

n(xyzw)  =df  (u)(3v){7i(xyzw)  &  [p(xuw)  ^>  p(zuv)  & 


In $\mathcal{H}$ the axiom P12 is equivalent to the following axiom, which asserts
the existence of parallels:

H12  $(x)(y)(z)(\exists w)[\sim\lambda(xyz) \supset \pi'(xyzw)]$

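Let $\mathfrak{F}$ be a Pythagorean field and e be a positive element in $\mathfrak{F}$ such that
for every $x, y \in \mathfrak{F}$ with $x^2 + y^2 < e$ there is a $z \in \mathfrak{F}$ such that $z^2 = e - x^2 - y^2$.
Then a model $\mathfrak{H}(\mathfrak{F}, e)$ for $\mathcal{H}$ is obtained by taking all pairs
$x = (x_1, x_2)$ of elements from $\mathfrak{F}$ subject to the restriction $x_1^2 + x_2^2 < e$,
where the basic relations are defined by the following conditions:

$\beta(xyz) =_{df} [(x_1 - y_1)(y_2 - z_2) = (x_2 - y_2)(y_1 - z_1) \,\&\, 0 \le (x_1 - y_1)(y_1 - z_1) \,\&\, 0 \le (x_2 - y_2)(y_2 - z_2)]$,

and

$\delta(xyzw) =_{df} \left[ \dfrac{(e - x_1 y_1 - x_2 y_2)^2}{(e - x_1^2 - x_2^2)(e - y_1^2 - y_2^2)} = \dfrac{(e - z_1 w_1 - z_2 w_2)^2}{(e - z_1^2 - z_2^2)(e - w_1^2 - w_2^2)} \right]$.

Every model of $\mathcal{H}$ is isomorphic to some $\mathfrak{H}(\mathfrak{F}, e)$. If e is a square then $\mathfrak{F}$ is
Euclidean and by a change of coordinates we may take e = 1. Every
model of $\mathcal{H}^*$ is isomorphic to $\mathfrak{H}(\mathfrak{F}) = \mathfrak{H}(\mathfrak{F}, 1)$ for some Euclidean field $\mathfrak{F}$.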
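
The congruence relation of the disc model can be checked on a few points. The sketch below is an illustration under assumptions: it takes e = 1, works over the rationals (which are not a Pythagorean field), and assumes that congruence compares the quantity (e - x·y)² / ((e - x·x)(e - y·y)) for the two pairs of points. The comparison is cross-multiplied so that no division is needed; the names `E_CONST`, `inside` and `delta_h` are hypothetical.

```python
from fractions import Fraction

E_CONST = Fraction(1)  # the constant e of the model, normalised to 1 (assumption)


def inside(x):
    """A pair belongs to the model when x1^2 + x2^2 < e."""
    return x[0] ** 2 + x[1] ** 2 < E_CONST


def delta_h(x, y, z, w):
    """Hyperbolic congruence of the segments xy and zw (assumed formula)."""
    def num(p, q):
        return (E_CONST - p[0] * q[0] - p[1] * q[1]) ** 2

    def den(p, q):
        return ((E_CONST - p[0] ** 2 - p[1] ** 2)
                * (E_CONST - q[0] ** 2 - q[1] ** 2))

    # num(x,y)/den(x,y) == num(z,w)/den(z,w), cross-multiplied
    return num(x, y) * den(z, w) == num(z, w) * den(x, y)


def rot90(p):
    """Quarter turn about the origin; it maps the disc onto itself."""
    return (-p[1], p[0])
```

A quarter turn about the origin should preserve congruence, so a segment and its rotated image are expected to satisfy `delta_h`, while segments of visibly different length from the origin are not.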

2. Relations between order and collinearity. We have defined collinearity
in our geometries in terms of order, i.e. in terms of $\beta$ in the Euclidean and
hyperbolic geometries and in terms of $\gamma$ in the elliptic geometries. In this
section we consider the possibilities of the definitions in the converse
direction. The following propositions show what can be accomplished in
this direction.

PROPOSITION 1. In $\mathcal{P}^*$ we have the following equivalence:

$\beta(xyz) \vee \beta(xzy) \leftrightarrow (\exists u)(\exists v)(\exists w)[(x = y) \vee (x = z) \vee (y = z) \vee \{\lambda(xyz) \,\&\, \lambda(xyw) \,\&\, \lambda(xuv) \,\&\, \pi(uyvw) \,\&\, \pi(uwvx)\}]$.

PROPOSITION 2. In $\mathcal{E}^*$ we have the following equivalence 1:

$\gamma(xyzw) \leftrightarrow (\exists r)(\exists s)(\exists t)(\exists u)(\exists v)[\lambda(xyz) \,\&\, \lambda(yzw) \,\&\, \lambda(xyt) \,\&\, \lambda(xuv) \,\&\, \lambda(wrs) \,\&\, \lambda(uyr) \,\&\, \lambda(uts) \,\&\, \lambda(vtr) \,\&\, \lambda(vzs)]$.

1 This equivalence was first pointed out and used by Pieri [6] to define order in
Projective Geometry.



PROPOSITION 3. In $\mathcal{H}$ we have the following equivalence 2:

$\beta(xyz) \leftrightarrow (u)(v)(\exists w)[\lambda(xyz) \,\&\, \lambda(wyv) \,\&\, \{\lambda(wux) \vee \lambda(wuz)\}]$.

THEOREM 1. Order can be defined in terms of collinearity in $\mathcal{P}^*$, $\mathcal{E}^*$ and
$\mathcal{H}$. On the other hand, order cannot be defined in $\mathcal{E}$ and $\mathcal{P}$ on the basis of
collinearity and congruence.

The possibility of defining order in $\mathcal{P}^*$, $\mathcal{E}^*$ and $\mathcal{H}$ follows from
Propositions 1-3. To show the impossibility of defining order in $\mathcal{E}$ and $\mathcal{P}$ solely
in terms of collinearity, we shall use the method of Padoa (cf. [10] and
[11]) and construct the following model: Let $\mathfrak{F}$ be the smallest field
containing all algebraic numbers and an indeterminate $\omega$, and closed under the
operation of taking the square root of a sum of squares. Thus each element
of $\mathfrak{F}$ is an algebraic function $F(\omega)$ with algebraic coefficients. We make $\mathfrak{F}$
into two distinct ordered fields $\mathfrak{F}_1$ and $\mathfrak{F}_2$ by taking two different real
transcendental numbers $\omega_1$ and $\omega_2$ and in $\mathfrak{F}_1$ setting $F(\omega) > 0$ if $F(\omega_1) > 0$
and in $\mathfrak{F}_2$ setting $F(\omega) > 0$ if $F(\omega_2) > 0$. If we form the coordinate
geometries $\mathfrak{C}(\mathfrak{F}_1)$ and $\mathfrak{C}(\mathfrak{F}_2)$ (or equivalently $\mathfrak{E}(\mathfrak{F}_1)$ and $\mathfrak{E}(\mathfrak{F}_2)$), then the
natural isomorphism is a (1-1) mapping which preserves collinearity and
congruence but not order.
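
The effect of the two orderings can be made concrete. The following sketch is an illustration, not part of the proof: it picks e and π as the two transcendental numbers (an arbitrary choice), approximates them by floating-point values, and looks at three collinear points whose coordinates are polynomials in the indeterminate. The element ω - 3 is negative under the first ordering and positive under the second, so the order of the points on their common line changes. All names are hypothetical.

```python
import math

OMEGA_1 = math.e   # ordering of F_1: the sign of F(omega) is the sign of F(e)
OMEGA_2 = math.pi  # ordering of F_2: the sign of F(omega) is the sign of F(pi)


def beta(x, y, z):
    """Coordinate betweenness, evaluated after substituting a real number."""
    (x1, x2), (y1, y2), (z1, z2) = x, y, z
    return ((x1 - y1) * (y2 - z2) == (x2 - y2) * (y1 - z1)
            and 0 <= (x1 - y1) * (y1 - z1)
            and 0 <= (x2 - y2) * (y2 - z2))


# Three points on the first axis; only C depends on the indeterminate.
def point_a(omega):
    return (0.0, 0.0)


def point_b(omega):
    return (3.0, 0.0)


def point_c(omega):
    return (omega, 0.0)


def between_in(omega, p, q, r):
    """Does q lie between p and r in the ordering determined by omega?"""
    return beta(p(omega), q(omega), r(omega))
```

Under the first ordering the point with coordinate ω lies between the points with coordinates 0 and 3; under the second it lies beyond 3. The three points are collinear in both cases, which is the sense in which collinearity does not determine order here.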

3. The notion of orthogonality. Scott [9] has introduced the notion
$\tau(xyz)$ whose meaning is that x, y, and z form a triangle with a right angle at
x. This notion can be defined in terms of congruence as follows:

$\tau(xyz) =_{df} (\exists u)(\exists v)[(u \neq y) \,\&\, (u \neq z) \,\&\, (v \neq y) \,\&\, (v \neq z) \,\&\, \delta(xyxv) \,\&\, \delta(xzxu) \,\&\, \delta(yzuv) \,\&\, \delta(yzzv) \,\&\, \delta(yzyu)]$

In this section we shall show that collinearity and congruence can be
defined in terms of $\tau$. For collinearity we have the following proposition:

PROPOSITION 4. In $\mathcal{P}$, $\mathcal{H}$, and $\mathcal{E}$ we have

$\lambda(xyz) \leftrightarrow (\exists r)[\tau(rxy) \,\&\, \tau(rxz)]$.

In order to define the congruence relation $\delta$, we introduce the auxiliary
relation $\mu$ defined as follows:

&  6(xyxz)]. 


2  This  definition  of  order  was  given  by  Jenks  [2]. 



It is easy to define $\delta$ in terms of the notion of two points being equidistant
from a third (cf. Pieri [7]). But this latter notion can be defined in terms of
$\mu$ and $\tau$ by the following proposition due to Scott:

PROPOSITION 5. In $\mathcal{E}$, $\mathcal{P}$, and $\mathcal{H}$ we have

$\delta(xyyz) \leftrightarrow (\exists r)[\mu(yxz) \vee \{\mu(rxz) \,\&\, \tau(rxy)\}]$.

Thus we can define $\delta$ from $\tau$ if we can define $\mu$ from $\tau$. This is
accomplished by the following two propositions, the first of which is due to
Scott. Considerations similar to the second are found in Robinson [8],
Section 3.

PROPOSITION 6. In $\mathcal{P}$ we have

$\mu(xyz) \leftrightarrow \{[x = y \,\&\, x = z] \vee (\exists u)(\exists v)[\tau(uyz) \,\&\, \tau(vyz) \,\&\, \tau(yuv) \,\&\, \tau(zuv) \,\&\, \tau(xyu) \,\&\, \tau(xyv) \,\&\, \tau(xzu) \,\&\, \tau(xzv)]\}$.

PROPOSITION 7. In $\mathcal{E}$ and $\mathcal{H}$ we have

$\mu(xyz) \leftrightarrow \{[x = y \,\&\, x = z] \vee (r)(\exists u)(\exists v)(\exists s)[\tau(xry) \supset \tau(xrz) \,\&\, \tau(yxu) \,\&\, \tau(zxv) \,\&\, \tau(rxu) \,\&\, \tau(rxv) \,\&\, \tau(vzs) \,\&\, \tau(uys) \,\&\, \lambda(xrs)]\}$.

These  propositions  together  with  the  example  at  the  end  of  the  previous 
section  give  us  the  following  theorem  : 

THEOREM 2. In $\mathcal{E}$, $\mathcal{H}$, and $\mathcal{P}$ the notions of collinearity $\lambda$ and
congruence $\delta$ can be defined in terms of $\tau$. Thus in $\mathcal{E}^*$, $\mathcal{P}^*$, and $\mathcal{H}$ we may use
the relation $\tau$ as the sole primitive notion. On the other hand, $\tau$ does not suffice
for the definition of order in $\mathcal{E}$ and $\mathcal{P}$.

4. Collinearity as the sole primitive in $\mathcal{H}^*$. In this section we shall show
that in $\mathcal{H}^*$ the notions of congruence can be defined in terms of
collinearity $\lambda$ (cf. [1], [4], and [5]). Since we have shown in the previous
section that the notion $\tau$ of orthogonality can be used as the sole primitive
in $\mathcal{H}$, it will suffice to show that $\tau$ can be defined in terms of $\lambda$.

We begin by using an auxiliary relation $\psi$ defined as follows:

$\psi(xyuv) =_{df} (\exists w)(\exists u')(\exists v')[\pi'(xyuw) \,\&\, \pi'(xyvv') \,\&\, \pi'(yxuu') \,\&\, \pi(yxvw) \,\&\, \pi(xwu'u) \,\&\, \pi'(xwv'v)]$.

The meaning of $\psi$ can best be seen by using a model $\mathfrak{H}(\mathfrak{F}, 1)$ of $\mathcal{H}^*$ and



assuming without loss of generality that x has coordinates (0, 0). The
field $\mathfrak{F}$ is of course a Euclidean field. The definition of $\psi(xyuv)$ then states
that uv and xy are diagonals of a quadrilateral in $\mathfrak{C}(\mathfrak{F})$ whose third
diagonal passes through x. Since, in the Euclidean geometry of $\mathfrak{C}(\mathfrak{F})$, x is
the midpoint of the diagonal xy, we must have uv parallel to xy in the
Euclidean sense.

With the above explanation in mind we see that for points x, y, z of
$\mathfrak{H}(\mathfrak{F}, 1)$ with x = (0, 0), zx will be perpendicular to xy in the Euclidean
geometry of $\mathfrak{C}(\mathfrak{F})$ if and only if there are points u, v, r, s, and t such that
the formulas $\psi(xyuv)$, $\psi(xzrs)$, $\pi'(uvsr)$, $\pi'(xtrs)$, and $\pi'(txvu)$ hold in
$\mathfrak{H}(\mathfrak{F}, 1)$. However, the Euclidean and hyperbolic notions of a right angle
coincide at the origin of $\mathfrak{C}(\mathfrak{F})$. Thus we have the following proposition:

PROPOSITION 8. In $\mathcal{H}^*$ we have

$\tau(xyz) \leftrightarrow (\exists u)(\exists v)(\exists r)(\exists s)(\exists t)[\psi(xyuv) \,\&\, \psi(xzrs) \,\&\, \pi'(uvsr) \,\&\, \pi'(xtrs) \,\&\, \pi'(txvu)]$.

Since $\pi'$ was defined in terms of $\lambda$ alone, and since we have seen that $\tau$
can be used as the sole primitive notion in $\mathcal{H}$, we have the following
corollary:

COROLLARY 3. In $\mathcal{H}^*$ collinearity can be taken as the sole primitive
notion.

5.  Units  of  length.  In  the  elliptic  and  hyperbolic  geometries  we  have 
natural  units  of  length,  and  the  question  immediately  arises  whether  or 
not  the  notion  of  two  points  being  at  a  unit  distance  can  serve  as  a 
primitive.  In  the  elliptic  geometry  the  most  natural  distance  to  take  is 
one-half the length of a straight line. We call this distance P the polar
distance, and define the relation $\alpha(xy)$ to mean that x and y are at distance
P. This notion is easily defined in terms of congruence and collinearity,
and conversely we can define orthogonality in terms of it as follows:

$\tau(xyz) \leftrightarrow (\exists u)(\exists v)\{\alpha(ux) \,\&\, \alpha(uz) \,\&\, \alpha(uv) \,\&\, \alpha(vx) \,\&\, \alpha(vy)\}$.

This  together  with  the  example  in  Section  2  gives  us  the  following 
proposition : 


3 I suspect that this corollary is still true if $\mathcal{H}^*$ is replaced by $\mathcal{H}$, but I have not
carried out a proof. The fact that there are no parallels in a model of $\mathcal{H}$ which is not
a model for $\mathcal{H}^*$ complicates considerations of this sort, but the method of Menger
in [5] may be applicable.



PROPOSITION 9. The binary relation $\alpha$ of two points being at the polar
distance can be used as the sole primitive notion in $\mathcal{E}^*$ but not in $\mathcal{E}$.

On the other hand, if we use the notion $\alpha'(xy)$ of two points being at a
distance less than P/2, we may define $\alpha(xy)$ as $\sim(\exists u)[\alpha'(xu) \,\&\, \alpha'(uy)]$.
Thus in $\mathcal{E}$ we can define collinearity and congruence in terms of $\alpha'$ and
it is not too difficult to define order in terms of $\alpha'$. Thus we have the
following:

PROPOSITION 10. In $\mathcal{E}$ the binary notion $\alpha'$ of two points being closer
than half the polar distance may be used as the sole primitive.

Robinson [8] has shown that in $\mathcal{P}^*$ with a unit of length introduced
as a new primitive we cannot use the unit of length to define collinearity
or congruence in elementary terms. If the points x, y, and z are within
a fixed integer multiple of the unit distance of one another, then, as
Seidenberg has pointed out, the collinearity of xyz can be defined in $\mathcal{P}^*$
in terms of the relation of two points being at a unit distance. This
definition follows from the principle of the Peaucellier inversor (cf.
Robinson [8], Section 6). If we enlarge our logical basis to include finite
sets of elements and add to $\mathcal{P}^*$ the axiom of Archimedes, then we may
use the unit of length as a sole primitive. Similar results hold for
hyperbolic geometry.

6. Geometries with points and lines as basic elements. It follows from
Robinson's results in [8] that it is not possible to find a binary relation
which will serve in elementary terms as the only primitive notion for $\mathcal{H}^*$
and $\mathcal{P}^*$ even if we adjoin a unit of length to $\mathcal{P}^*$. We may use a single
binary primitive notion for these geometries, however, if we cease to
regard them as elementary statements about relations between points and
instead regard a geometry as a class of statements about points and lines
and relations between them. Thus, we can define a hyperbolic geometry
$\mathcal{H}^*$ which uses the single primitive $\varepsilon$ of incidence between point and line.
In terms of $\varepsilon$ we define the unary notion p(x) of being a point as follows:

$p(x) =_{df} (y)(z)(\exists u)[\{\varepsilon(xu) \,\&\, \varepsilon(yu)\} \vee \{\varepsilon(zy) \supset \varepsilon(zu) \,\&\, \varepsilon(xu)\}]$.

With this we define collinearity among points by

$\lambda(xyz) =_{df} (\exists w)\{p(x) \,\&\, p(y) \,\&\, \varepsilon(xw) \,\&\, \varepsilon(yw) \,\&\, \varepsilon(zw)\}$.

We now add the axioms and definitions of $\mathcal{H}^*$ (together with some



additional axioms to ensure that elements which are not points are lines).
We then have a geometry which is isomorphic to $\mathcal{H}^*$ when relativised to
statements which only contain points as variables.

A similar procedure is possible for Euclidean geometry with a unit of
length. Let $\zeta(xy)$ be the binary relation which states that x is a point at
unit distance from the line y or else y is a point at unit distance from the
line x. As before we define a point by the condition:

$p(x) =_{df} (y)(z)(\exists u)[\{\zeta(xu) \,\&\, \zeta(yu)\} \vee \{\zeta(zy) \supset \zeta(zu) \,\&\, \zeta(xu)\}]$.

We can define collinearity by noting that five distinct points are collinear
if they are all at a unit distance from each of two distinct lines. From
this we can construct a point geometry corresponding to $\mathcal{P}^*$ with a unit
of length.

The  method  of  Lindenbaum  and  Tarski  [11]  enables  one  to  show 
that  in  Euclidean  geometry  without  a  unit  of  length  there  is  no  binary 
relation  between  points  and  lines  from  which  we  can  define  congruence. 


Bibliography 

[1] ABBOTT, J. C., The projective theory of non-Euclidean geometry. Reports of a Mathematical Colloquium, University of Notre Dame Press (1941-1944), pp. 13-51.
[2] JENKS, F. P., A set of postulates for Bolyai-Lobatchevsky geometry. Proceedings of the National Academy of Sciences, vol. 26 (1940), pp. 277-279.
[3] MENGER, K., Non-Euclidean geometry of joining and intersecting. Bulletin of the American Mathematical Society, vol. 44 (1938), pp. 821-824.
[4] MENGER, K., A new foundation of non-Euclidean, affine, real projective and Euclidean geometry. Proceedings of the National Academy of Sciences, vol. 24 (1938), p. 486.
[5] MENGER, K., New projective definitions of the concepts of hyperbolic geometry. Reports of a Mathematical Colloquium, University of Notre Dame Press Series 2, no. 7 (1946), pp. 20-28.
[6] PIERI, M., I principi della geometria di posizione composti in sistema logico deduttivo. Memorie della Reale Accademia delle Scienze di Torino, vol. 48 (1899), pp. 1-62.
[7] PIERI, M., La geometria elementare istituita sulle nozioni di 'punto' e 'sfera'. Memorie di Matematica e di Fisica della Scienze, ser. 3, vol. 15 (1908), pp. 345-450.
[8] ROBINSON, R. M., Binary relations as primitive notions in elementary geometry. This volume, pp. 68-85.
[9] SCOTT, D., A symmetric primitive notion for Euclidean geometry. Indagationes Mathematicae, vol. 18 (1956), pp. 457-461.
[10] TARSKI, A., Some methodological investigations on the definability of concepts. Logic, Semantics, Metamathematics, Oxford 1956, art. X.
[11] TARSKI, A., On the limitations of the means of expression of deductive theories. Logic, Semantics, Metamathematics, Oxford 1956, art. XIII (joint article with A. Lindenbaum).
[12] TARSKI, A., What is elementary geometry? This volume, pp. 16-29.


Symposium  on  the  Axiomatic  Method 


DIRECT  INTRODUCTION  OF  WEIERSTRASS  HOMOGENEOUS 

COORDINATES  IN  THE  HYPERBOLIC  PLANE, 
ON  THE  BASIS  OF  THE  ENDCALCULUS  OF  HILBERT  i 

PAUL SZÁSZ

Loránd Eötvös University of Budapest, Budapest, Hungary

Introduction. In the present paper let any system of "points" and
"lines" be called a hyperbolic plane for which, besides the groups of axioms
of incidence, of order and of congruence of the plane I, II, III of Hilbert [3],
[4], the following two axioms are valid:

AXIOM IV$_1$. Let P, Q be two different points in the plane and QY a
half-line on the one side of the line PQ; then there exists always one half-line PX
on the same side of PQ that does not intersect QY, while every internal
half-line PZ lying in the $\sphericalangle QPX$ cuts the half-line QY (Fig. 1).


Fig.  1 


Fig.  2 


AXIOM IV$_2$. There exists a line $s_0$ and a point $P_0$ outside it in the plane,
for which two different lines could be drawn through $P_0$ that do not intersect
$s_0$ (Fig. 2).

I  have  shown  [7],  [8]  that  these  axioms  imply  the  following  theorem. 

1  A  more  detailed  exposition  has  been  published  in  German  (see  [12]). 


THEOREM. If s is an arbitrary line and P an arbitrary point outside it,
then the lines drawn through P and intersecting s form the internal lines of a
certain $\sphericalangle (p_1, p_2)$ (Fig. 3). These lines $p_1$, $p_2$, which do not intersect s any
more, are called parallels to s through P.


Fig.  3 

This Theorem was laid down by Hilbert [3] as Axiom IV. The Axioms
IV$_1$, IV$_2$ mentioned above form, together with the axiom-groups I, II,
III, apparently weaker assumptions than the groups I, II, III, IV of the
quoted author.

In  the  work  cited  above  Hilbert  called  "ends"  the  points  at  infinity 
of  the  plane  defined  by  any  pencil  of  parallel  lines.  A  line  possesses,  in 
consequence  of  the  above  Theorem,  always  two  ends.  After  the  proof  of 
the  fundamental  theorem,  according  to  which  two  lines  neither  inter- 
secting each  other  nor  being  parallel,  must  have  a  common  perpendicular, 
Hilbert  was  able  to  prove  also  the  existence  of  that  line  which  possesses 
two  prescribed  ends.  From  this  it  follows,  that  a  determined  perpendicular 
can  be  dropped  on  a  line  from  an  end  not  belonging  to  it.  From  among  the 
preliminary  theorems,  stated  by  Hilbert  for  his  so  called  endcalculus,  I 
wish  to  stress  only  the  one  just  mentioned.  This  endcalculus  I  am  going 
to  explain  below,  in  §  1 . 

The  way  sketched  by  Hilbert  [3]  for  the  construction  of  hyperbolic 
geometry  in  the  plane,  leads  through  projective  geometry.  In  contrast 
to  that  way  there  will  be  created  in  the  present  paper  a  completely 
elementary  construction  of  hyperbolic  plane  geometry  by  means  of  direct 
introduction  of  certain  homogeneous  coordinates  and  an  independent 
foundation  of  hyperbolic  analytic  geometry.  Henceforth  these  coordinates 
will  be  called  the  Weierstrass  homogeneous  coordinates,  because  they  are 




identical with the well-known ones, if one assumes the axioms of
continuity, instead of Axiom IV$_1$, making the incomplete axiom-system
complete [9]. This construction of hyperbolic geometry does not depend
on hyperbolic trigonometry, the latter being a consequence of the analytic
geometry of the hyperbolic plane, founded here. 2 Neither do I make use
of Euclidean geometry, and therefore my exposition may be called an
independent elementary foundation of hyperbolic plane geometry.

1. The endcalculus of Hilbert. The distance-function $\mathfrak{E}(t)$ and those
developed from it. The endcalculus of Hilbert, somewhat altered for my
purpose, follows.

Let a right angle in the plane be given with the vertex O, the sides of
which as half-lines have the ends $\Omega$, E (Fig. 4).

Fig. 4

The end $\Omega$ (called by Hilbert $\infty$) will be distinguished, and the
endcalculus defined for the ends different from $\Omega$. Such an end $\alpha$ should be
called positive when the lines $\alpha\Omega$ and $E\Omega$ are lying on the same side of
the line $O\Omega$, and in case these lines lie on different sides of $O\Omega$, the end $\alpha$

2  For  the  case  of  the  assumption  of  the  axioms  of  continuity,  see  Szasz  [10]. 




is called negative. The other end of the reflection of the line $\alpha\Omega$ in $O\Omega$
should be denoted with $-\alpha$, and the other one of $O\Omega$ with 0. The addition
of the ends is defined by Hilbert as follows.

Let $\alpha$ and $\beta$ be ends differing from $\Omega$. The reflections of O in $\alpha\Omega$ and $\beta\Omega$
should be denoted with $O_\alpha$, $O_\beta$ respectively (Fig. 5). The middle point of
the segment $O_\alpha O_\beta$ being denoted with M, we define as the "sum $\alpha + \beta$" the
other end of the line $M\Omega$.

Fig. 5

The definition of the product might be expressed more simply by
introducing, unlike Hilbert, the following distance-function that is going to be
essential all through our treatment (cf. Szász [11]).

Directing the line $O\Omega$ towards $\Omega$, let us draw a perpendicular to $O\Omega$ through
the end-point A of the segment OA = t, regard being paid to sign. Let the
positive end $\alpha$ of this perpendicular be designated with $\mathfrak{E}(t)$:

(1)  $\alpha = \mathfrak{E}(t)$

(Fig. 6). Evidently any positive end $\alpha$ corresponds to one and only one
distance t with sign.




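Using the designation (1) we define as the "product $\alpha_1\alpha_2$" of the positive
ends $\alpha_1 = \mathfrak{E}(t_1)$ and $\alpha_2 = \mathfrak{E}(t_2)$ the end $\mathfrak{E}(t_1 + t_2)$, i.e.

(2)  $\mathfrak{E}(t_1)\mathfrak{E}(t_2) = \mathfrak{E}(t_1 + t_2)$

(Fig. 7). Further we agree that for positive ends $\alpha$, $\beta$,

(3)  $\alpha(-\beta) = (-\alpha)\beta = -\alpha\beta$,  $(-\alpha)(-\beta) = \alpha\beta$,

and for any end $\xi$ differing from $\Omega$ there should hold

(4)  $\xi \cdot 0 = 0 \cdot \xi = 0$.

Fig. 6

Fig. 7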



Thus we have given the definition of the multiplication of ends differing
from $\Omega$ in every case, this being equivalent to Hilbert's definition.

The positive end E is by designation (1) $\mathfrak{E}(0)$, playing the part of the
positive unity since according to (2) $\mathfrak{E}(t)\mathfrak{E}(0) = \mathfrak{E}(0)\mathfrak{E}(t) = \mathfrak{E}(t)$. That is
why we introduce the designation

(5)  $\mathfrak{E}(0) = 1$,

which by (2) may be written also as

(5*)  $\mathfrak{E}(t)\mathfrak{E}(-t) = 1$.

The end designated with 0, which, according to (4), under
multiplication plays the part of zero, behaves under addition also like zero,
because evidently for any end $\xi$ differing from $\Omega$ holds

(6)  $\xi + 0 = 0 + \xi = \xi$

and

(7)  $\xi + (-\xi) = 0$.

D. Hilbert showed in his work (cited above) that in the endcalculus defined
in such a way, the familiar laws are valid concerning the four rules of
arithmetic. Or, using a modern expression: the ends differing from $\Omega$ form a
commutative field. This field moreover has the fundamental property of
any positive end being a square. Indeed in the sense of (2), we have

$\mathfrak{E}(t) = \left[\mathfrak{E}\!\left(\tfrac{t}{2}\right)\right]^2$.

The field of ends different from $\Omega$ can be made an ordered field by the
following agreement: let $\alpha$ be called greater than $\beta$ ($\beta$ less than $\alpha$), in symbols
$\alpha > \beta$ ($\beta < \alpha$), in case the end $\alpha - \beta$ is positive. One is easily convinced
that for positive ends $\alpha$, $\beta$ in case of $\alpha > \beta$ the line $\beta\Omega$ lies between the lines
$O\Omega$, $\alpha\Omega$, and vice versa. From this it results that $\mathfrak{E}(t) > 1$ if $t > 0$, and
then it follows at once that in general

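For the sake of brevity it is also suitable to introduce besides the
distance-function $\mathfrak{E}(t)$ the following ones too:

(9)  $C(t) = \dfrac{\mathfrak{E}(t) + \mathfrak{E}(-t)}{2}$,  $S(t) = \dfrac{\mathfrak{E}(t) - \mathfrak{E}(-t)}{2}$,  $T(t) = \dfrac{S(t)}{C(t)}$.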



While $\mathfrak{E}(t)$ is the analogue of the exponential function, these latter
distance-functions are the analogues of the hyperbolic functions. For the
first two holds e.g. the fundamental formula

(10)  $C(t)^2 - S(t)^2 = 1$

and also the formulas

(11)  $C(a + b) = C(a)C(b) + S(a)S(b)$

and

(12)  $S(a + b) = S(a)C(b) + S(b)C(a)$

are valid for them.

Also, these distance-functions in (9) remind us of the hyperbolic
functions, just as the distance-function $\mathfrak{E}(t)$ reminds us of the exponential
function, e.g. it satisfies the inequality (8).
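
The addition formulas (11) and (12) depend only on the functional equation (2) and on $\mathfrak{E}(0) = 1$, so they can be checked in any exact model of those two properties. The sketch below is illustrative only: it assumes that C and S are the half-sum and half-difference of $\mathfrak{E}(t)$ and $\mathfrak{E}(-t)$, and it replaces $\mathfrak{E}$ by the stand-in t -> 2^t on integers, which is not the geometric distance-function itself.

```python
from fractions import Fraction


def E(t):
    """Stand-in for the distance-function: E(s)E(t) = E(s+t), E(0) = 1.

    Exact for integer t only; the base 2 is an arbitrary choice.
    """
    return Fraction(2) ** t


def C(t):
    """Analogue of the hyperbolic cosine."""
    return (E(t) + E(-t)) / 2


def S(t):
    """Analogue of the hyperbolic sine."""
    return (E(t) - E(-t)) / 2


def T(t):
    """Analogue of the hyperbolic tangent."""
    return S(t) / C(t)


def addition_formulas_hold(a, b):
    """Check (11) and (12) exactly for one pair of integer arguments."""
    return (C(a + b) == C(a) * C(b) + S(a) * S(b)
            and S(a + b) == S(a) * C(b) + S(b) * C(a))
```

Because the arithmetic is exact, `addition_formulas_hold(a, b)` is a genuine equality test for each integer pair rather than a floating-point approximation.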


Fig.  8 

2. The Weierstrass homogeneous coordinates of a point. An arbitrary
point P in the plane may be characterized (Fig. 8) with the two data
mentioned below. One of them is the other end of line $P\Omega$, let it be $\alpha$.
However, $O_{\alpha/2}$ being the reflection of O in the line with the ends $\frac{\alpha}{2}$, $\Omega$,
the other end of the line $O_{\alpha/2}\Omega$ is, according to the definition of the sum of
ends (§ 1), $\frac{\alpha}{2} + \frac{\alpha}{2} = \alpha$, that is to say the line $\alpha\Omega$ is the reflection of $O\Omega$
in the previous line with the ends $\frac{\alpha}{2}$ and $\Omega$. Consequently the reflection
P' of P in this line joining the ends $\frac{\alpha}{2}$ and $\Omega$ lies in $O\Omega$. Now the distance
OP' = t taken with sign on the line $O\Omega$ directed towards $\Omega$ is the other
datum, evidently determining P together with the end $\alpha$ mentioned
before. These data t, $\alpha$ should be called mixed-coordinates of point P. By
means of these may be proved the following

THEOREM. The points of the hyperbolic plane and the end-triads
$(x_1, x_2, x_3)$, built with ends differing from $\Omega$ for which holds

(1)  $x_3^2 - x_2^2 - x_1^2 = 1$

and

(2)  $x_3 > 0$,

are put in one-to-one correspondence. This correspondence might be produced
by making each point (t, $\alpha$) given in mixed-coordinates correspond to the
end-triad

(3)  $x_1 = S(t) + \tfrac{1}{2}\alpha^2\mathfrak{E}(-t)$,
     $x_2 = \alpha\mathfrak{E}(-t)$,
     $x_3 = C(t) + \tfrac{1}{2}\alpha^2\mathfrak{E}(-t)$.

The  concept  of  inequality  (§  1)  is  made  use  of  in  the  proof. 

The ends $x_1$, $x_2$, $x_3$ in (3) should be called Weierstrass homogeneous
coordinates of the point the mixed-coordinates of which are t, $\alpha$. From
(3) follows for the case t = 0, $\alpha$ = 0, that the Weierstrass homogeneous
coordinates of point O are

(4)  $x_1 = 0$,  $x_2 = 0$,  $x_3 = 1$.

Later on, for the transformation of the coordinates, it becomes of
fundamental importance that for any two points $(x_1, x_2, x_3)$ and $(x_1', x_2', x_3')$
holds

(5)  $x_3 x_3' - x_2 x_2' - x_1 x_1' > 0$.
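
The passage from mixed-coordinates to Weierstrass coordinates can be checked numerically. This is a sketch under stated assumptions: it reads formulas (3) as x1 = S(t) + α²E(-t)/2, x2 = αE(-t), x3 = C(t) + α²E(-t)/2, and it again uses the exact stand-in t -> 2^t for the distance-function, with rational numbers playing the role of ends. The function names are hypothetical.

```python
from fractions import Fraction


def E(t):
    """Stand-in for the distance-function (exact for integer t)."""
    return Fraction(2) ** t


def C(t):
    return (E(t) + E(-t)) / 2


def S(t):
    return (E(t) - E(-t)) / 2


def weierstrass(t, a):
    """Mixed-coordinates (t, a) -> triad (x1, x2, x3), assumed reading of (3)."""
    k = a * a * E(-t) / 2
    return (S(t) + k, a * E(-t), C(t) + k)


def form(p, q):
    """The bilinear expression x3*x3' - x2*x2' - x1*x1' appearing in (1) and (5)."""
    return p[2] * q[2] - p[1] * q[1] - p[0] * q[0]
```

With these definitions `form(p, p)` should equal 1 for every triad produced by `weierstrass`, the third coordinate should be positive, and `form(p, q)` should be positive for any two such triads, in agreement with (1), (2) and (5).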



3. The equation of the line. Weierstrass homogeneous line coordinates.

The derivation of the equation of the line may be based upon the two
Lemmas of Hilbert [3] mentioned below.

LEMMA 1. $\alpha$, $\beta$ being ends different from $\Omega$, the reflection of the line $\alpha\Omega$
in $\beta\Omega$ is the line joining the end $2\beta - \alpha$ with $\Omega$.

LEMMA 2. For the ends $\alpha$, $\beta$ of a line that goes through O holds $\alpha\beta = -1$.
This plainly follows from the fact that if from among these ends the
positive one is $\alpha = \mathfrak{E}(t)$, then the other one is evidently $\beta = -\mathfrak{E}(-t)$;
their product is really $-\mathfrak{E}(-t)\mathfrak{E}(t) = -\mathfrak{E}(0) = -1$.


Fig.  9 


Let us consider first a line possessing the ends $\xi$, $\eta$ differing from $\Omega$
(Fig. 9). Let an arbitrary point of this line be given in mixed-coordinates
(§ 2) P(t, $\alpha$). Then by reflecting the plane in the line that joins the end $\frac{\alpha}{2}$
with $\Omega$ and after that translating it along $O\Omega$ by the piece $-t$, the point
P goes into O. The ends $\xi$, $\eta$ by this reflection go into the ends $\alpha - \xi$,
$\alpha - \eta$, respectively, according to Lemma 1, and the latter ends go into
$\mathfrak{E}(-t)(\alpha - \xi)$, $\mathfrak{E}(-t)(\alpha - \eta)$, respectively, as follows from the definition
of the product of ends (§ 1). Since this line goes through point O already
(because P is turned into O), the product of these two ends due to Lemma
2 is (§ 1, (2))

(1)  $\mathfrak{E}(-2t)(\alpha - \xi)(\alpha - \eta) = -1$.

It may be seen at once that, conversely, if (1) holds for a certain point
(t, $\alpha$), then this point lies on the line $\xi\eta$. That is to say, (1) is the equation
of the line connecting the ends $\xi$, $\eta$ expressed in mixed-coordinates.

Now the equation (1) can be transformed into Weierstrass homogeneous
coordinates $x_1$, $x_2$, $x_3$. Namely, we obtain from formulas (3) of the
preceding section, by multiplying (1) with $\mathfrak{E}(t)$, that the line joining the ends
$\xi$, $\eta$ differing from $\Omega$ has the equation

(2)  $(\xi\eta - 1)x_1 + (\xi + \eta)x_2 - (\xi\eta + 1)x_3 = 0$

in Weierstrass homogeneous coordinates.

In mixed-coordinates t, $\alpha$ the equation of the line $\eta\Omega$ with the end $\eta$ is
evidently $\alpha - \eta = 0$. Multiplying by $\mathfrak{E}(-t)$ and writing in terms of the
coordinates $x_1$, $x_2$, $x_3$ we see that the equation of the line connecting the end
$\eta$ with $\Omega$ is in Weierstrass homogeneous coordinates

(3)  $\eta x_1 + x_2 - \eta x_3 = 0$.

By introducing the designations

(4)  $u = \dfrac{\xi\eta - 1}{\xi - \eta}$,  $v = \dfrac{\xi + \eta}{\xi - \eta}$,  $w = \dfrac{\xi\eta + 1}{\xi - \eta}$,

equation (2) divided by $\xi - \eta$ takes the form

(2*)  $u x_1 + v x_2 - w x_3 = 0$,

where

(5)  $u^2 + v^2 - w^2 = 1$.

The ends u, v, w in (4) should be called the Weierstrass homogeneous line
coordinates of the line $\xi\eta$ directed towards $\xi$, and (2*) the normal-form of the
equation of this line.

Per definitionem, the Weierstrass homogeneous line coordinates of
the line connecting the end $\eta$ with $\Omega$ and directed towards $\Omega$ are to be

(4*)  $u = \eta$,  $v = 1$,  $w = \eta$

and further let (3) be the normal-form of the equation of this line. By
reversing orientation, the line coordinates are multiplied by $(-1)$ and the
equation multiplied with $(-1)$ should be called the normal form.
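
The agreement between the mixed-coordinate equation of a line and its equation in Weierstrass coordinates can be tested on examples. The sketch below rests on several assumptions that are not guaranteed by the text as it stands: the mixed equation is read as E(-2t)(α - ξ)(α - η) = -1, the line coordinates as the coefficients of (2) divided by ξ - η, the point coordinates as x1 = S(t) + α²E(-t)/2, x2 = αE(-t), x3 = C(t) + α²E(-t)/2, and the distance-function is replaced by the exact stand-in t -> 2^t with rational ends.

```python
from fractions import Fraction


def E(t):
    """Stand-in for the distance-function (exact for integer t)."""
    return Fraction(2) ** t


def weierstrass(t, a):
    """Assumed point coordinates for mixed-coordinates (t, a)."""
    c = (E(t) + E(-t)) / 2
    s = (E(t) - E(-t)) / 2
    k = a * a * E(-t) / 2
    return (s + k, a * E(-t), c + k)


def on_line_mixed(t, a, xi, eta):
    """Assumed mixed-coordinate equation of the line with ends xi, eta."""
    return E(-2 * t) * (a - xi) * (a - eta) == -1


def line_coords(xi, eta):
    """Assumed line coordinates (u, v, w) of the line directed towards xi."""
    d = xi - eta
    return ((xi * eta - 1) / d, (xi + eta) / d, (xi * eta + 1) / d)


def on_line_weierstrass(point, line):
    """Normal-form equation u*x1 + v*x2 - w*x3 = 0."""
    (x1, x2, x3), (u, v, w) = point, line
    return u * x1 + v * x2 - w * x3 == 0
```

For the line with ends 2 and -2 and the point with mixed-coordinates (1, 0), both tests succeed, and the line coordinates satisfy u² + v² - w² = 1, which is the normalisation expected of a normal-form equation.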



It may be easily shown that every equation (2*) in which (5) holds for
the coefficients u, v, w is the normal-form of the equation of a certain directed
line.

4. Transformation of the Weierstrass coordinates. Besides the right
angle QOE that we have used in the definition of the endcalculus, let us take
yet another right angle Q'O'E', where Q' and E' are ends. Consider the
congruence transformation of the plane into itself that superposes the
right angle Q'O'E' on QOE. Let a certain directed line e be transformed
into e' by this transformation. By the Weierstrass homogeneous line
coordinates of the directed line e with respect to the "coordinate-system"
Q'O'E' we mean those of e' with respect to the original system QOE.

We  define  in  a  similar  way  the  Weierstrass  homogeneous  coordinates  of  a 
point  P  with  respect  to  the  coordinate-system  Q'O'E'. 

The connection of the new coordinates with the old ones can be considered
first for the line coordinates, namely by making use of the fact that a
congruence transformation of the plane into itself transforms every
end $\xi$ differing from Q into the end

$\xi' = \frac{\alpha\xi + \beta}{\gamma\xi + \delta}$

(in the case $\gamma \neq 0$ it transforms the end $-\delta/\gamma$ into Q and this latter into the
end $\alpha/\gamma$), where the coefficients $\alpha, \beta, \gamma, \delta$ depend only on the new system
Q'O'E' and

$\alpha\delta - \beta\gamma = \pm 1$

holds, according as a correspondence in the same or in the opposite sense is
involved [6] (see footnote 3).

3 For the proof see Gerretsen [2], Szasz [11], [12].

On the basis of this fact we obtain that the new line coordinates
expressed in terms of the original ones are

(1)  $u' = a_{11}u + a_{12}v + a_{13}w,$
     $v' = a_{21}u + a_{22}v + a_{23}w,$
     $w' = a_{31}u + a_{32}v + a_{33}w,$

where the coefficients $a_{ik}$ depend only on the new system, and among which
the relations

(2)  $a_{11}^2 + a_{21}^2 - a_{31}^2 = 1,$
     $a_{12}^2 + a_{22}^2 - a_{32}^2 = 1,$
     $a_{13}^2 + a_{23}^2 - a_{33}^2 = -1$

and

(3)  $a_{11}a_{12} + a_{21}a_{22} - a_{31}a_{32} = 0,$
     $a_{12}a_{13} + a_{22}a_{23} - a_{32}a_{33} = 0,$
     $a_{13}a_{11} + a_{23}a_{21} - a_{33}a_{31} = 0$

are valid; further, the discriminant of this transformation is

(4)  $D = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = 1.$


From this it follows, by a simple consideration making use of
the inequality (5) of § 2, that the new coordinates of a point
expressed in terms of the original ones are

(5)  $x_1' = \pm(a_{11}x_1 + a_{12}x_2 + a_{13}x_3),$
     $x_2' = \pm(a_{21}x_1 + a_{22}x_2 + a_{23}x_3),$
     $x_3' = \pm(a_{31}x_1 + a_{32}x_2 + a_{33}x_3),$

where the coefficients $a_{ik}$ are the same as those in (1) and the sign + or - is
valid according as the two coordinate systems have the same or the opposite
sense, respectively.
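
The relations among the coefficients say that the columns of the coefficient matrix are orthonormal with respect to the indefinite form $x_1y_1 + x_2y_2 - x_3y_3$, with values 1, 1, -1 on the diagonal. The following is a minimal sketch of that check, assuming a hypothetical coefficient matrix built from a rotation about O and a translation along a line through O, both with rational entries; none of these names or numbers come from the paper.

```python
from fractions import Fraction as F

def matmul(p, q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(a, x):
    """Apply the matrix a to the coordinate triple x."""
    return [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]

def form(x, y):
    """The indefinite bilinear form x1*y1 + x2*y2 - x3*y3."""
    return x[0] * y[0] + x[1] * y[1] - x[2] * y[2]

def satisfies_relations(a):
    """Relations (2) and (3) of section 4, read off column by column."""
    cols = [[a[i][j] for i in range(3)] for j in range(3)]
    squares = [form(c, c) for c in cols] == [1, 1, -1]
    mixed = all(form(cols[i], cols[j]) == 0
                for i in range(3) for j in range(i + 1, 3))
    return squares and mixed

# Hypothetical coefficients, chosen rational so that every check is exact.
ROTATION = [[F(3, 5), F(-4, 5), 0], [F(4, 5), F(3, 5), 0], [0, 0, 1]]
TRANSLATION = [[1, 0, 0], [0, F(5, 4), F(3, 4)], [0, F(3, 4), F(5, 4)]]
A = matmul(TRANSLATION, ROTATION)

point = [F(3, 4), 0, F(5, 4)]   # satisfies x1^2 + x2^2 - x3^2 = -1
line = [-1, 0, 0]               # line coordinates of OE
print(satisfies_relations(A))
print(form(apply(A, line), apply(A, point)) == form(line, point))
```

Because the same matrix acts on line and point coordinates, the expression $ux_1 + vx_2 - wx_3$ is left unchanged, so incidence of a point and a line is preserved.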

5. Distance of two points. The geometrical significance of the
expression $u_1u_2 + v_1v_2 - w_1w_2$ for two lines. Distance of a point from a line.

Choosing the new coordinate-system suitably, it follows from the formulas
of the coordinate transformation (§ 4, (5)), by means of the relations
between the coefficients (§ 4, (2), (3)), that the distance d of the points
$(x_1, x_2, x_3)$ and $(\bar{x}_1, \bar{x}_2, \bar{x}_3)$ in the original system satisfies

(1)  $C(d) = x_3\bar{x}_3 - x_2\bar{x}_2 - x_1\bar{x}_1.$

This formula (1) at once discloses the simple geometrical significance of
the third coordinate $x_3$. Namely, by taking as second point O, the
coordinates of which are 0, 0, 1 (§ 2, (4)), formula (1) expresses that the
third coordinate $x_3$ of a point P, determined by the distance d = OP, is

(2)  $x_3 = C(d).$

Similarly,  from  the  formulas  of  the  transformation  of  the  line  coordi- 
nates (§4,  (1),  (2),  (3))  by  taking  the  coordinate-system  suitably,  follows 
in  succession,  that 

(i)  for the directed lines $s_1$, $s_2$ intersecting each other, one has

$u_1u_2 + v_1v_2 - w_1w_2 = T(a)$

where a designates the signed distance of the foot of the perpendicular
dropped upon $s_1$ from the end of $s_2$ lying in the positive direction (Fig. 10).

Fig. 10

(ii)  for lines $s_1$, $s_2$ possessing a common perpendicular and directed
equally one has

$u_1u_2 + v_1v_2 - w_1w_2 = C(a)$

where a signifies the segment of the common perpendicular between $s_1$ and $s_2$
(Fig. 11).

(iii)  for parallel lines directed equally (Fig. 12) one has

$u_1u_2 + v_1v_2 - w_1w_2 = 1.$

Fig. 11

Fig. 12

From these theorems and the behavior of the functions C(t) and T(t) it
follows that the lines $(u_1, v_1, w_1)$ and $(u_2, v_2, w_2)$, differing from each other,

1)  meet if, and only if,

$|u_1u_2 + v_1v_2 - w_1w_2| < 1;$

in particular they are perpendicular if, and only if,

$u_1u_2 + v_1v_2 - w_1w_2 = 0;$

2)  have a common perpendicular if, and only if,

$|u_1u_2 + v_1v_2 - w_1w_2| > 1;$

3)  are parallel if, and only if,

$|u_1u_2 + v_1v_2 - w_1w_2| = 1.$

Finally, it follows from a suitable choice of the new coordinate-system
of the same sense, and from the formulas with respect to line and point
coordinates together, that for the distance t of the point $(x_1, x_2, x_3)$ from the
directed line (u, v, w) one has

(3)  $S(t) = ux_1 + vx_2 - wx_3$

where t should be taken positive or negative according as the point is on
the positive or negative side of the line.


Fig.  13 

This theorem discloses the simple geometrical meaning of the first two
coordinates $x_1$, $x_2$. Namely, since the end E in the endcalculus is $\xi = 1$
and the other end of the line OE is $\eta = -1$, the line coordinates
of this line are -1, 0, 0 (§ 3, (4)); thus for the signed distance -a of the
point $(x_1, x_2, x_3)$ from the line OE one has by formula (3) $-x_1 = S(-a)$,
or

(4)  $x_1 = S(a).$

Since moreover the line coordinates of the line OQ, directed towards Q,
are 0, 1, 0 (§ 3, (4*)), for the signed distance b of the point $(x_1, x_2, x_3)$
from this latter line one has by (3),

(5)  $x_2 = S(b).$


Combining these results (4) and (5) with that of (2), we may state that
if the distance of a point from the line OE is a, from OQ is b, and from the point O
is d, and if we take the distance a on the right side of OE as positive (Fig. 13),
the distance b above OQ as positive as well, and both on the other side as
negative, then in the endcalculus with respect to the right angle QOE the
Weierstrass homogeneous coordinates of this point are

(6)  $x_1 = S(a), \quad x_2 = S(b), \quad x_3 = C(d).$
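
As a numerical illustration of the point coordinates and the two distance formulas, the following sketch assumes the classical real model, in which the functions S and C may be taken to be sinh and cosh. That identification is an assumption made here for illustration; the paper defines its functions within the endcalculus. The helper names are hypothetical.

```python
import math

O = (0.0, 0.0, 1.0)

def point(a, b):
    """Point with signed distances a, b from the lines OE and OQ.
    Assumes S = sinh; x3 is fixed by x3^2 - x1^2 - x2^2 = 1, x3 > 0,
    which is the distance formula applied to the pair (P, P)."""
    x1, x2 = math.sinh(a), math.sinh(b)
    return (x1, x2, math.sqrt(1.0 + x1 * x1 + x2 * x2))

def distance(x, y):
    """Solve C(d) = x3*y3 - x2*y2 - x1*y1 for d, assuming C = cosh."""
    c = x[2] * y[2] - x[1] * y[1] - x[0] * y[0]
    return math.acosh(max(c, 1.0))

def signed_distance(x, line):
    """Solve S(t) = u*x1 + v*x2 - w*x3 for t, assuming S = sinh."""
    u, v, w = line
    return math.asinh(u * x[0] + v * x[1] - w * x[2])

P = point(0.3, -1.2)
d = distance(P, O)
print(d, math.cosh(d), P[2])                  # cosh(d) agrees with x3
print(signed_distance(P, (0.0, 1.0, 0.0)))    # distance from OQ
print(signed_distance(P, (-1.0, 0.0, 0.0)))   # distance from OE, sign reversed
```

In this model the three distances of a point are tied together by $\sinh^2 a + \sinh^2 b = \sinh^2 d$.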

The methods of §§ 2-5 are those by means of which I have founded, on
the basis of the endcalculus of Hilbert, the analytic geometry of the
hyperbolic plane. In this way I have laid the foundation for a completely
elementary and at the same time independent construction of hyperbolic
plane geometry.

It  is  not  difficult  indeed,  on  the  basis  of  the  above  exposition,  to  intro- 
duce the  homogeneous  coordinates  of  points  at  infinity  (viz.  ends)  and  that 
of  ideal  points,  further  to  define  the  concept  of  the  ideal  line  and  that  of 
line  at  infinity  analytically.  The  identity  of  hyperbolic  plane  geometry 
with  the  well-known  circle-model  of  Klein-Hilbert  [5],  [4,  p.  38]  emphatic- 
ally independent  of  continuity,  is  already  a  consequence  of  this  analytic 
geometry. 

To conclude we may mention that, by a result in J. C. H. Gerretsen [1],
the  axiom  on  the  intersection  of  two  circles  can  be  derived  from  the 
axioms  of  the  hyperbolic  plane  referred  to  at  the  beginning  of  this 
discussion.  The  analytic  geometry  of  the  hyperbolic  plane  outlined  in 
the  present  paper  provides  a  new  proof  of  this  result  (cf.  Szasz  [13]). 


Bibliography

[1]  GERRETSEN, J. C. H., Die Begründung der Trigonometrie in der hyperbolischen
Ebene. Koninklijke Nederlandsche Akademie van Wetenschappen, Proceedings
of the Section of Sciences, vol. 45 (1942), pp. 360-366, 479-483, 559-566.

[2]  GERRETSEN, J. C. H., Zur hyperbolischen Geometrie. Koninklijke Nederlandsche
Akademie van Wetenschappen, Proceedings of the Section of Sciences,
vol. 45 (1942), pp. 567-573.

[3]  HILBERT, D., Neue Begründung der Bolyai-Lobatschefskyschen Geometrie.
Mathematische Annalen, vol. 57 (1903), pp. 137-150.

[4]  HILBERT, D., Grundlagen der Geometrie. 7. Aufl., Leipzig and Berlin 1930,
pp. 159-177.

[5]  KLEIN, F., Über die sogenannte Nicht-Euklidische Geometrie. Mathematische
Annalen, vol. 4 (1871), pp. 583-625, spec. pp. 620-621 (reprinted in Gesammelte
Mathematische Abhandlungen I, Berlin 1921, pp. 254-305, spec. 300-301).

[6]  LIEBMANN, H., Über die Begründung der hyperbolischen Geometrie.
Mathematische Annalen, vol. 59 (1904), pp. 110-128.

[7]  SZASZ, PAUL, A Poincaré-féle félsík és a hiperbolikus síkgeometria kapcsolatáról
(in Hungarian). A Magyar Tudományos Akadémia III. Osztályának
Közleményei, vol. 6 (1956), pp. 163-184.

[8]  SZASZ, PAUL, A remark on Hilbert's foundation of the hyperbolic plane geometry.
Acta Mathematica Academiae Scientiarum Hungaricae, vol. 9 (1958), pp. 29-31.

[9]  SZASZ, PAUL, Begründung der analytischen Geometrie der hyperbolischen Ebene
mit den klassischen Hilfsmitteln, unabhängig von der Trigonometrie dieser Ebene.
Acta Mathematica Academiae Scientiarum Hungaricae, vol. 8 (1957), pp. 139-157.

[10]  SZASZ, PAUL, Die hyperbolische Trigonometrie als Folge der analytischen
Geometrie der hyperbolischen Ebene. Acta Mathematica Academiae Scientiarum
Hungaricae, vol. 8 (1957), pp. 159-161.

[11]  SZASZ, PAUL, Über die Hilbertsche Begründung der hyperbolischen Geometrie.
Acta Mathematica Academiae Scientiarum Hungaricae, vol. 4 (1954), pp. 243-250.

[12]  SZASZ, PAUL, Unmittelbare Einführung Weierstrassscher homogenen Koordinaten
in der hyperbolischen Ebene auf Grund der Hilbertschen Endenrechnung, Anhang.
Acta Mathematica Academiae Scientiarum Hungaricae, vol. 9 (1958), pp. 1-28,
spec. 26-28.

[13]  SZASZ, PAUL, New proof of the circle axiom for two circles in the hyperbolic
plane by means of the endcalculus of Hilbert. Annales Universitatis Scientiarum
Budapestinensis de Rolando Eötvös nominatae, vol. 1 (1958), pp. 97-100.

Symposium  on  the  Axiomatic  Method 


AXIOMATIC CONSTRUCTION OF PLANE ABSOLUTE GEOMETRY
[AXIOMATISCHER AUFBAU DER EBENEN ABSOLUTEN GEOMETRIE]

FRIEDRICH BACHMANN

Mathematisches Seminar, Christian-Albrechts-Universität, Kiel, Germany

1. Absolute geometry is to be understood, in the sense of J. BOLYAI, as the
common foundation of the Euclidean and the non-Euclidean geometries.
The question of parallels, i.e. the question whether lines intersect or do
not intersect, is left open.

The construction of plane absolute geometry sketched here is of particular
interest because of its methodical use of reflections. Orderability and free
mobility are not required. The notion of absolute geometry is taken so
generally that models can be constructed over all fields of characteristic
$\neq 2$ in which not every element is a square.

2. Let there first be given a set of points and a set of lines, and further
an incidence of point and line and a perpendicularity of lines, such that
the following axioms hold:

INCIDENCE AXIOMS. There is at least one line, and at least three points are
incident with every line. For two distinct points there is exactly one line
which is incident with both points.

ORTHOGONALITY AXIOMS. If a is perpendicular to b, then b is perpendicular to a.
Perpendicular lines have a point in common. Through every point there is a
perpendicular to every line, and if the point is incident with the line,
only one.

REFLECTION AXIOM (SCHÜTTE). For every line g there is at least one reflection
in g, i.e. an involutory, orthogonality-preserving collineation which leaves
all points of g fixed.

(A one-to-one mapping of a set onto itself is called involutory if it is
equal to its inverse mapping but different from the identity mapping.)

Products of line reflections are called motions.

From these axioms it follows: the lines a correspond one-to-one to the
reflections $\sigma_a$ in the lines, and the points A correspond one-to-one to the
point reflections $\sigma_A$, which are defined dually to the line reflections.
Furthermore:

A, b are incident  is equivalent to  $\sigma_A\sigma_b$ is involutory.

a, b are perpendicular  is equivalent to  $\sigma_a\sigma_b$ is involutory.

By replacing the points and lines by the point reflections and line
reflections, and the given relations of incidence and perpendicularity by
the equivalent relations between the reflections, one therefore obtains
within the group of motions an isomorphic image of the given geometric
structure. This permits geometric theorems to be formulated as statements
about reflections and to be proved by group-theoretic calculation with
reflections. The application of a line reflection $\sigma_g$ to the points and lines
corresponds, in the isomorphic image, to the group-theoretic transformation
(conjugation) of all point and line reflections by the line reflection $\sigma_g$.

3. By the theorem of the three reflections we mean the statement: The
product of the reflections in three lines a, b, c which are incident with one
point or perpendicular to one line is equal to the reflection in a line d.

A totality of points and lines for which the axioms stated above and the
theorem of the three reflections hold is called a metric plane, and the
theory of these metric planes is called plane absolute geometry.

4. For the construction of plane absolute geometry we use, in accordance
with the considerations in 2, instead of the axioms stated so far a system
of axioms which characterizes the groups of motions of the metric planes.

We first introduce some group-theoretic notation. Let an arbitrary group be
given. If $\alpha$, $\gamma$ are group elements, we denote the element $\gamma^{-1}\alpha\gamma$, which arises
from $\alpha$ by transformation with $\gamma$, by $\alpha^\gamma$. We have $(\alpha\beta)^\gamma = \alpha^\gamma\beta^\gamma$ and
$\alpha^{\gamma\delta} = (\alpha^\gamma)^\delta$. A set of group elements is called invariant if it is closed
under transformation with arbitrary group elements.

Let $\rho$, $\sigma$ be involutory group elements. If the relation

(1)  $\rho\sigma$ is involutory

holds for them, we write $\rho \mid \sigma$ for short. Obviously (1) is equivalent to
$\rho\sigma = \sigma\rho$ and $\rho \neq \sigma$. We write $\rho_1, \ldots, \rho_m \mid \sigma_1, \ldots, \sigma_n$ as an abbreviation
for the conjunction of the statements $\rho_i \mid \sigma_k$ (i = 1, ..., m; k = 1, ..., n).
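
The stroke relation can be tried out in a familiar model. The following is a minimal sketch, assuming the ordinary Euclidean plane over the rationals, with line reflections written as 3x3 affine matrices; this model and all names in the code are illustrative assumptions, not part of the paper. It checks that perpendicular lines and incident point-line pairs are exactly those whose product is involutory.

```python
from fractions import Fraction as F

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def mul(p, q):
    """Product of two 3x3 matrices given as nested tuples."""
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def reflection(a, b, c):
    """Reflection in the Euclidean line a*x + b*y + c = 0 (affine matrix)."""
    n = F(a * a + b * b)
    return ((1 - 2 * a * a / n, -2 * a * b / n, -2 * a * c / n),
            (-2 * a * b / n, 1 - 2 * b * b / n, -2 * b * c / n),
            (F(0), F(0), F(1)))

def is_involutory(g):
    """Equal to its own inverse, but different from the identity."""
    return g != IDENTITY and mul(g, g) == IDENTITY

def stroke(rho, sigma):
    """The relation rho | sigma: rho, sigma and rho*sigma are involutory."""
    return (is_involutory(rho) and is_involutory(sigma)
            and is_involutory(mul(rho, sigma)))

x_axis = reflection(0, 1, 0)
y_axis = reflection(1, 0, 0)
diagonal = reflection(1, -1, 0)      # the line y = x
line_y1 = reflection(0, 1, -1)       # the line y = 1
origin = mul(x_axis, y_axis)         # point reflection in (0, 0)

print(stroke(x_axis, y_axis))        # perpendicular lines
print(stroke(origin, diagonal))      # the point lies on the line
print(stroke(origin, line_y1))       # the point does not lie on the line
print(stroke(x_axis, diagonal))      # lines meeting at 45 degrees
```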

5 (Group-theoretic axiom system of plane absolute geometry).

BASIC ASSUMPTION. Let there be given a group G and an invariant system of
generators S of G consisting of involutory elements.

The elements of S are denoted by lower-case Latin letters. The involutory
elements of G which can be represented as a product of two elements of S
are denoted by capital Latin letters (other than G, H, S).

AXIOM 1. For A, B there is always a c with A, B | c.

AXIOM 2. From A, B | c, d it follows that A = B or c = d.

AXIOM 3. If a, b, c | E, then there is a d such that abc = d.

AXIOM 4. If a, b, c | e, then there is a d such that abc = d.

AXIOM 5. There are a, b, c such that a | b and neither c | a nor c | b nor
c | ab holds.

This axiom system is a reduced version of an axiom system given by ARNOLD
SCHMIDT.

6 (Group plane). If $G_m$ is the group of motions of a metric plane and $S_m$
the set of line reflections, then the pair $G_m$, $S_m$ satisfies the
group-theoretic axiom system.

Conversely, to every pair G, S which satisfies the group-theoretic axiom
system a metric plane can be assigned by the following construction of the
group plane of G, S:

The elements a, b, ... (the elements of S) are called lines, the elements
A, B, ... points of the group plane. Two lines a and b of the group plane are
called perpendicular to each other if a | b holds. (The points are thus the
group elements which can be represented as a product of two perpendicular
lines.) A point A and a line b of the group plane are called incident if
A | b holds. Axiom 1 states that two points always have a joining line.
Axiom 2 states that two distinct points have at most one joining line.
Axiom 5 expresses a minimal existence requirement and states that there are
two perpendicular lines a, b and a line c which is perpendicular neither to
a nor to b and is also not incident with the point ab.

We define further: three lines a, b, c of the group plane lie in a pencil if

(2)  $abc \in S$

holds. If this is the case, so that there is a d with abc = d, we call d the
fourth reflection line of a, b, c. Axiom 3 and Axiom 4 state that three
lines which are incident with one point or perpendicular to one line lie in
a pencil.

The axiom system permits S to contain elements a, b, c for which abc = 1.
Then the lines a, b, c of the group plane are pairwise perpendicular. We say
that three such lines form a polar trilateral. (Polar trilaterals occur, as
is well known, in elliptic planes.) If abc = 1, hence ab = c, then ab, being
an involutory product of two elements of S, is an element C; thus one and
the same group element is both a point and a line of the group plane. In
general, if C = c, we call the point C and the line c of the group plane
polar to each other. If this is the case, every line incident with the point
C is perpendicular to the line c, and conversely; for if C = c, then for all
x: C | x implies c | x, and conversely.

From the axioms it follows:

EXISTENCE OF THE PERPENDICULAR. For A, b there is always a c with A, b | c,
i.e. through every point there is a perpendicular to every line.

UNIQUENESS OF THE PERPENDICULAR. From A, b | c, d it follows that A = b or
c = d, i.e. if A, b are not polar to each other, there is only one
perpendicular to b through A. In particular, if A, b are incident, the
perpendicular erected on b at A is uniquely determined and equal to Ab.

The reflection of the group plane in a line c is the mapping

(3)  $x^* = x^c, \quad X^* = X^c.$

By Axioms 3 and 4, the theorem of the three reflections holds for the
reflections (3). The motions of the group plane are the mappings

(4)  $x^* = x^\gamma, \quad X^* = X^\gamma$  with $\gamma \in G$,

that is, the inner automorphisms of G, applied to the set of lines and the
set of points.

The motions (4) of the group plane form a group G*, which is generated by
the set S* of the reflections (3) in the lines of the group plane. The
centre of G consists of the identity element only. The pair G*, S* is a
representation of the axiomatically given pair G, S.

7. Theorems of absolute geometry are now proved by group-theoretic
calculation with the involutory elements a, b, ... and A, B, .... There are
many simple proofs of this kind.

As an example we consider the theorem of the isogonal relationship with
respect to a trilateral a, b, c. It can be formulated as follows: If a', b', c'
are lines which lie in a pencil, and if b, a', c as well as c, b', a as well
as a, c', b lie in a pencil, then the fourth reflection lines ba'c = a'',
cb'a = b'', ac'b = c'' also lie in a pencil.

THEOREM OF THE ISOGONAL RELATIONSHIP. From ba'c = a'', cb'a = b'', ac'b = c''
and $a'b'c' \in S$ it follows that $a''b''c'' \in S$.

PROOF. We have $a''b''c'' = ba'c \cdot cb'a \cdot ac'b = (a'b'c')^b$. From $a'b'c' \in S$
it follows, by the invariance of S, that $(a'b'c')^b \in S$, and hence $a''b''c'' \in S$.
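
The key step of this proof, $a''b''c'' = (a'b'c')^b$, uses only that a, b, c are involutions, so it can be checked in any group. The following is a minimal sketch, assuming permutations of six symbols as a stand-in group; the permutations are arbitrary illustrative choices and are not claimed to satisfy the axioms. It also checks the two rules for the notation $\alpha^\gamma = \gamma^{-1}\alpha\gamma$.

```python
from functools import reduce

def compose(p, q):
    """Product pq of two permutations written as tuples: apply p, then q."""
    return tuple(q[j] for j in p)

def product(*factors):
    """Product of several permutations, taken from left to right."""
    return reduce(compose, factors)

def inverse(p):
    """Inverse permutation."""
    result = [0] * len(p)
    for i, image in enumerate(p):
        result[image] = i
    return tuple(result)

def conj(alpha, gamma):
    """alpha^gamma = gamma^-1 * alpha * gamma."""
    return product(inverse(gamma), alpha, gamma)

def transposition(i, j, n=6):
    """The involution exchanging i and j among n symbols."""
    p = list(range(n))
    p[i], p[j] = p[j], p[i]
    return tuple(p)

# Hypothetical involutions standing in for a, b, c and a', b', c'.
a, b, c = transposition(0, 1), transposition(1, 2), transposition(2, 3)
a1, b1, c1 = transposition(3, 4), transposition(0, 4), transposition(1, 5)

a2 = product(b, a1, c)   # a'' = b a' c
b2 = product(c, b1, a)   # b'' = c b' a
c2 = product(a, c1, b)   # c'' = a c' b
print(product(a2, b2, c2) == conj(product(a1, b1, c1), b))
```

The inner factors cc and aa cancel, which leaves $b\,a'b'c'\,b$; since b is its own inverse, this is the conjugate of a'b'c' by b.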

Concerning the ternary relation (2), by which the lying in a pencil of lines
is defined, we remark:

Owing to the invariance of S, the relation (2) is reflexive and symmetric
in the following sense: for elements a, b, c which are not all distinct, (2)
always holds; if (2) holds for elements a, b, c, then it also holds for every
permutation of a, b, c. From the axiom system of absolute geometry it
follows that the relation (2) is also transitive, i.e. the

TRANSITIVITY THEOREM. From $a \neq b$ and $abc, abd \in S$ it follows that $acd \in S$.

Useful for proofs in absolute geometry are lemmas about not necessarily
involutory elements of G, such as the following:

LEMMA OF THOMSEN. Let $\alpha$ and $\beta$ be elements of G which can be represented as
products of an odd number of elements of S. If $\alpha \neq 1$ and $\alpha^\beta = \alpha^{-1}$, then
$\alpha$ or $\beta$ lies in S.

LEMMA OF THE NINE INVOLUTORY PRODUCTS. If $\alpha_i, \beta_k \in G$ (i, k = 1, 2, 3) and
$\alpha_1 \neq \alpha_2$, $\beta_1 \neq \beta_2$, then the following holds: if an element of S stands at
each of the eight places marked o in the product table of the $\alpha_i\beta_k$, then
so does one at the place marked *.

          $\beta_1$   $\beta_2$   $\beta_3$
  $\alpha_1$     o     o     o
  $\alpha_2$     o     o     o
  $\alpha_3$     o     o     *

From the lemma of THOMSEN one obtains by substitution, for example, the
altitude theorem; from the lemma of the nine involutory products, for
example, HESSENBERG's counter-pairing theorem [Gegenpaarungssatz]
(quadrilateral theorem), with which the theorem of PAPPUS can be obtained.

As an example, let the proof of the altitude theorem be carried out here. We
consider a trilateral which is not a polar trilateral and whose sides do not
lie in a pencil. By an altitude we mean a line which is perpendicular to one
side of the trilateral and lies in a pencil with the two other sides.

ALTITUDE THEOREM. If $abc \neq 1$ and $abc \notin S$, and if

(5)  u | a,  v | b,  w | c,

(6)  $bcu, cav, abw \in S$,

then $uvw \in S$.

PROOF. By the first hypothesis in (5), ua = au, hence $a^u = a$, and by the
first hypothesis in (6), bcu = ucb, hence $(bc)^u = cb$; altogether, therefore,
$(abc)^u = a^u(bc)^u = acb$. Repeating the argument, one obtains

$(abc)^{uvw} = (a^u(bc)^u)^{vw} = (acb)^{vw} = ((ac)^v b^v)^w = (cab)^w = c^w(ab)^w = cba.$

Thus

$(abc)^{uvw} = (abc)^{-1},$

and from this the assertion follows by the lemma of THOMSEN, because
$abc \neq 1$ and $abc \notin S$.
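
The computation in this proof can be replayed on a concrete triangle. The following is a minimal sketch, assuming the Euclidean plane over the rationals with line reflections as 3x3 affine matrices and the triangle with vertices (0, 0), (4, 0), (1, 3); the model, the triangle and all names are illustrative assumptions. It checks hypotheses (5) and (6), the identity $(abc)^{uvw} = (abc)^{-1}$, and the conclusion that uvw is involutory (which, for a product of three line reflections in this model, means that it is again a line reflection).

```python
from fractions import Fraction as F
from functools import reduce

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))

def mul(p, q):
    """Product of two 3x3 matrices given as nested tuples."""
    return tuple(tuple(sum(p[i][k] * q[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def product(*factors):
    return reduce(mul, factors)

def reflection(a, b, c):
    """Reflection in the Euclidean line a*x + b*y + c = 0 (affine matrix)."""
    n = F(a * a + b * b)
    return ((1 - 2 * a * a / n, -2 * a * b / n, -2 * a * c / n),
            (-2 * a * b / n, 1 - 2 * b * b / n, -2 * b * c / n),
            (F(0), F(0), F(1)))

def is_involutory(g):
    return g != IDENTITY and mul(g, g) == IDENTITY

# Sides of the triangle with vertices (0, 0), (4, 0), (1, 3).
a = reflection(1, 1, -4)    # x + y = 4
b = reflection(3, -1, 0)    # 3x - y = 0
c = reflection(0, 1, 0)     # y = 0
# Altitudes: each is perpendicular to one side and passes through the
# intersection point of the two other sides.
u = reflection(1, -1, 0)    # perpendicular to a through (0, 0)
v = reflection(1, 3, -4)    # perpendicular to b through (4, 0)
w = reflection(1, 0, -1)    # perpendicular to c through (1, 3)

hyp5 = all(is_involutory(mul(p, q)) for p, q in ((u, a), (v, b), (w, c)))
hyp6 = all(is_involutory(product(*t)) for t in ((b, c, u), (c, a, v), (a, b, w)))
abc = product(a, b, c)
# (abc)^(uvw): the inverse of uvw is wvu because u, v, w are involutions.
conjugate = product(w, v, u, abc, u, v, w)
print(hyp5, hyp6, conjugate == product(c, b, a), is_involutory(product(u, v, w)))
```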

8 (Pencils of lines). Since the ternary relation (2) is, as remarked in 7,
reflexive, symmetric and transitive, it defines subsets of S with the
following properties: 1) (2) holds for any three elements a, b, c of a
subset; 2) if the relation (2) holds between two distinct elements a, b of a
subset and an element c, then c also belongs to the subset; 3) for any two
elements a, b there is a subset to which they belong. From 1), 2), 3) it
follows: any two distinct subsets have at most one element in common.

These subsets of the set of all lines, defined by the relation (2), are
called pencils of lines. Any two distinct lines a, b determine a pencil of
lines; it consists of all lines c which lie in a pencil with a, b.

All lines which are incident with a given point A form a pencil of lines,
which we denote by G(A). Such pencils of lines are called proper pencils of
lines.

All lines which are perpendicular to a given line a form a pencil of lines,
the pencil of perpendiculars of a, which we denote by G(a).

9 (Half-rotations). A further tool for considerations in absolute geometry
is provided by certain mappings which are not motions, namely the
half-rotations introduced by HJELMSLEV.

Every element $\alpha$ of G which can be represented as a product of an odd number
of elements of S can be represented in the form

(7)  abc  with  a | b, c.

If $\alpha \neq 1$, then $\alpha$ uniquely determines the element a occurring in the
representation (7), which we denote by $[\alpha]$.

Now let $\gamma$ be a non-involutory element of G which can be represented as a
product of two lines incident with a point O. By $x \to [x\gamma]$ a one-to-one
mapping of the set of lines of the group plane into itself is defined. This
mapping is called the half-rotation about O belonging to the group element
$\gamma$, and is denoted by $H_\gamma$. Thus

(8)  $xH_\gamma = [x\gamma].$

The half-rotations preserve pencils: if three lines lie in a pencil, then
their image lines also lie in a pencil, and conversely. In particular, every
half-rotation about O maps the set of lines through O one-to-one onto
itself. Perpendicular lines will in general not go over into perpendicular
lines, but they do when one of the two lines passes through O.

Every half-rotation induces a one-to-one mapping of the set of pencils of
lines onto itself; in this mapping the set of proper pencils of lines is
mapped into itself. We also call this mapping of the pencils of lines a
half-rotation and denote it by the same symbol as the half-rotation of the
lines by which it is induced. The set of pencils of perpendiculars of the
lines through O is mapped onto itself by every half-rotation about O; every
other pencil of lines can be carried into a proper pencil of lines by a
suitable half-rotation about O.

10. From certain theorems of absolute geometry there arise, under certain
replacements of points by lines or of lines by points, again valid theorems
of absolute geometry. An example of this "point-line analogy", to which
ARNOLD SCHMIDT has drawn attention, is given by Axiom 3 and Axiom 4;
further examples are:

Left:   For A, B there is always a c with A, B | c
        (existence of the joining line).
Right:  For A, b there is always a c with A, b | c
        (existence of the perpendicular).

Left:   From A, B | c, d it follows that A = B or c = d
        (uniqueness of the joining line).
Right:  From A, b | c, d it follows that A = b or c = d
        (uniqueness of the perpendicular).


If in the statements on the right one also replaces the point A by a line a,
one obtains the statements

V    For a, b there is always a c with a, b | c,

i.e. any two lines have a common perpendicular, and

~R   From a, b | c, d it follows that a = b or c = d,

i.e. two distinct lines have at most one common perpendicular.
The statement ~R is the negation of the statement

R    There are a, b, c, d with a, b | c, d and $a \neq b$ and $c \neq d$,

which states that a rectangle [Rechtseit] exists.

None of the statements V, ~R, R is provable from the axioms of absolute
geometry. Each of them can be added as an additional axiom to the axiom
system of 5, thereby defining special cases of absolute geometry. The
statement V is equivalent to the existence of polar trilaterals and defines
elliptic geometry within the framework of our axiom system of absolute
geometry. The statement R is called the axiom of Euclidean metric, the
statement ~R the axiom of non-Euclidean metric. The additional axioms R and
~R lead to the bifurcation of absolute geometry into geometry with
Euclidean metric and geometry with non-Euclidean metric. From V follows ~R.

A general theorem describing the extent of the replacements of points by
lines and of lines by points that are permitted in absolute geometry is not
known. However, in the elliptic geometry defined by the additional axiom V,
arbitrary replacements of this kind are permitted.

11 (Projective-metric planes). By a projective plane we mean a set of points
and lines in which the projective incidence axioms, the theorem of PAPPUS
and the FANO axiom hold.

A projective plane in which a line is distinguished as the line "at
infinity" $g_\infty$, and on it a projective fixed-point-free involution as the
"absolute" involution, is called a singular projective-metric plane. To
every line $a \neq g_\infty$ we assign a pole, namely the point lying on $g_\infty$ which
corresponds, in the absolute involution, to the point of intersection of a
and $g_\infty$.


A projective plane in which a projective polarity is distinguished as the
"absolute" polarity is called an ordinary projective-metric plane.

Now let c be a line of a given projective-metric plane; in the singular case
let the line c be different from $g_\infty$, and in the ordinary case let it not be
incident with its pole. Then the harmonic homology whose axis is the line c
and whose centre is the pole of c is called the reflection of the
projective-metric plane in the line c. The group $G_{pm}$ generated by the set
$S_{pm}$ of these reflections in lines of the projective-metric plane is called
the group of motions of the projective-metric plane.
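
A reflection of this kind can be written down explicitly in coordinates. The following is a minimal sketch for the ordinary case, assuming points and lines are given by rational coordinate triples and the absolute polarity by the hypothetical matrix M = diag(1, 1, -1); these choices and all names are assumptions for illustration. With the pole P of the line c, the harmonic homology is taken as $H = I - 2Pc^T/(c \cdot P)$.

```python
from fractions import Fraction as F

# Hypothetical absolute polarity; this matrix is its own inverse.
M = ((1, 0, 0), (0, 1, 0), (0, 0, -1))

def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

def pole(line):
    """Pole of a line under the polarity M."""
    return tuple(dot(row, line) for row in M)

def reflection(line):
    """Harmonic homology with axis `line` and centre pole(line),
    as a matrix acting on point triples."""
    centre = pole(line)
    k = F(dot(line, centre))
    if k == 0:
        raise ValueError("the line is incident with its pole")
    return tuple(tuple((1 if i == j else 0) - 2 * centre[i] * line[j] / k
                       for j in range(3)) for i in range(3))

def apply(h, x):
    return tuple(dot(row, x) for row in h)

c = (1, 2, 1)
H = reflection(c)
on_axis = (1, 0, -1)                       # satisfies c . x = 0
print(apply(H, on_axis) == on_axis)        # points of the axis stay fixed
print(apply(H, pole(c)))                   # the centre, up to the factor -1
print(apply(H, apply(H, (2, -3, 5))))      # H is involutory
```

A line such as (1, 0, 1), which is incident with its own pole under this M, is rejected, in line with the condition stated for the ordinary case.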

12 (Ideal plane). The group plane of G, S can be extended to a
projective-metric plane by introducing ideal elements.

For this purpose the pencils of lines are called ideal points, and the
proper pencils of lines are called proper ideal points. The set of all
pencils of lines that have a line a in common is called the proper ideal
line g(a).

To define the notion of an ideal line in general, we use the
half-rotations, which make it possible to carry "improper" elements into
"proper" ones (cf. Section 9).

We choose a point O of the group plane, which we keep fixed from now on.
A half-rotation $H_\gamma$ about O carries every proper ideal line into a
proper ideal line; for we have

(9)  $g(a)H_\gamma = g(aH_\gamma)$.

The set of the pencils of perpendiculars of the lines through O, which is
carried into itself by every half-rotation about O, is denoted by g(O).

A set a of ideal points is now called an ideal line 1) if there is a
half-rotation $H_\gamma$ about O such that $aH_\gamma$ is a proper ideal
line, and furthermore 2) if a = g(O).

One then proves that the ideal points and ideal lines form a projective
plane, the ideal plane of G, S. The proper ideal points and the proper ideal
lines form a subplane of the ideal plane that is isomorphic to the group
plane.

It must now be shown that the orthogonality defined in the group plane
induces projective-metric relations in the ideal plane.

We first assume that the axiom of Euclidean metric holds in the group
plane. Then any two lines that have a common perpendicular are
perpendicular-equal (lotgleich) to each other, i.e., every perpendicular of
the one line is also a perpendicular of the other. Hence every pencil of
perpendiculars is also the pencil of perpendiculars of a line through a
fixed chosen point. The set of all pencils of perpendiculars is therefore an
ideal line, which we denote by $g_\infty$.

If one pencil of perpendiculars contains a line that is orthogonal to a
line of another pencil of perpendiculars, then every line of the one pencil
is orthogonal to every line of the other. There is therefore an
orthogonality of pencils of perpendiculars. It defines on the distinguished
ideal line $g_\infty$ a projective involution without fixed elements. The
ideal plane of a group plane with Euclidean metric is thus a singular
projective-metric plane.

Now let the axiom of non-Euclidean metric hold in the group plane. Then
every line is perpendicular-equal only to itself, and the pencils of
perpendiculars of distinct lines are distinct. If one assigns to each proper
ideal line g(a) the ideal point G(a) (the pencil of perpendiculars of a) as
its pole, this is now a one-to-one correspondence between the proper ideal
lines and the pencils of perpendiculars. To extend this correspondence to a
polarity defined on the whole ideal plane, we again use the half-rotations
about the point O. If one first applies a half-rotation $H_\gamma$ about O to
a proper ideal line g(a), then by (9) the proper ideal line $g(aH_\gamma)$
results. In general the pole G(a) of g(a) will not go over into the pole
$G(aH_\gamma)$ of $g(aH_\gamma)$. Rather, the following general relation
holds between G(a) and $G(aH_\gamma)$:

(10)  $G(a)H_\gamma^{-1} = G(aH_\gamma)$.

We call each pair g(a), G(a) a primitive polar-pole pair and now define,
for an ideal line a and an ideal point A:
a, A are called a polar-pole pair 1) if there is a half-rotation
$H_\gamma$ about O such that $aH_\gamma$, $AH_\gamma^{-1}$ are a primitive
polar-pole pair, and furthermore 2) if a = g(O), A = G(O).

One now proves that this defines a projective polarity in the ideal
plane. The ideal plane of a group plane with non-Euclidean metric is thus an
ordinary projective-metric plane.

13. The reflection (3) of the group plane in a line c induces the
reflection of the projective-metric ideal plane in the proper ideal line
g(c). The motions (4) of the group plane therefore induce motions of the
projective-metric ideal plane. This yields the

MAIN THEOREM. Every pair G, S that satisfies the group-theoretic axiom
system of Section 5 can be represented as a subsystem of a pair $G_{pm}$,
$S_{pm}$.

In other words: the groups of motions of the metric planes are
representable as subgroups of groups of motions of projective-metric planes.

14 (Metric vector spaces and orthogonal groups). Let $V_3(K, F)$ be the
three-dimensional vector space over a field K of characteristic $\neq 2$,
metrized by a symmetric bilinear form F. If in the metric vector space
$V_3(K, F)$ all isotropic vectors lie in the radical, the form F is called
anisotropic (nullteilig).

The proper orthogonal group $O_3^+(K, F)$ is defined as the group of all
linear mappings of the metric vector space $V_3(K, F)$ onto itself that
preserve the value of F and have determinant 1. By the reflection of the
metric vector space in a non-isotropic one-dimensional subspace T we
understand the involutory linear mapping of the metric vector space onto
itself that leaves every vector of the subspace T fixed and carries every
vector of the orthogonal complement of T into its negative. The set
$S_3^+(K, F)$ of all these reflections of the metric vector space is a
system of generators of the group $O_3^+(K, F)$.
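The reflection described above can be sketched in exact rational arithmetic. The sketch assumes a formula that is not written out in the text: for a non-isotropic vector t spanning T, the map x -> 2 F(x,t)/F(t,t) t - x fixes T and negates every vector orthogonal to t. The function names, and the choice of representing F by a symmetric 3x3 Gram matrix over the rationals, are illustrative assumptions only.

```python
from fractions import Fraction

def bilinear(F, x, y):
    """Value F(x, y) of the symmetric bilinear form with Gram matrix F."""
    return sum(x[i] * F[i][j] * y[j] for i in range(3) for j in range(3))

def reflect(F, t, x):
    """Image of x under the reflection in the line spanned by t (assumed formula)."""
    ftt = bilinear(F, t, t)
    if ftt == 0:
        raise ValueError("t is isotropic; no reflection in its span")
    k = Fraction(2) * bilinear(F, x, t) / ftt
    return [k * t[i] - x[i] for i in range(3)]

def reflection_matrix(F, t):
    """Matrix whose columns are the images of the standard basis vectors."""
    basis = [[Fraction(int(i == j)) for i in range(3)] for j in range(3)]
    cols = [reflect(F, t, e) for e in basis]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
```

With F = diag(1, 1, -1) and t = (1, 2, 1), for instance, the map is an involution, preserves F, and has determinant 1, in agreement with the statement that these reflections lie in the proper orthogonal group.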

15. Every projective plane can be represented as a three-dimensional
vector space over a field K of characteristic $\neq 2$, by representing the
lines by the one-dimensional and the points by the two-dimensional subspaces
of the vector space. Every projective-metric plane can be represented in a
corresponding way as a metric vector space $V_3(K, F)$; the form F is of
rank 2 and anisotropic in the singular case, and of rank 3 in the ordinary
case. The reflections of the projective-metric plane in the lines named in
Section 11 can be represented by the reflections of the metric vector space
in the non-isotropic one-dimensional subspaces. For the group of motions of
the projective-metric plane we therefore have: the pair $G_{pm}$, $S_{pm}$
can be represented by the pair

(11)  $O_3^+(K, F)$,  $S_3^+(K, F)$.

The Main Theorem therefore permits the groups that satisfy the axiom
system of absolute geometry (in other words, the groups of motions of the
axiomatically given metric planes) to be represented as groups of orthogonal
transformations of metric vector spaces:

MAIN THEOREM, algebraic version. Every pair G, S that satisfies the axiom
system of Section 5 is representable as a subsystem of a pair (11), where
the field K is of characteristic $\neq 2$ and the symmetric bilinear form F
is of rank 2 and anisotropic, or of rank 3.

16. Conversely, the question now arises which subsystems of such pairs
(11) are models of the axiom system of Section 5. On this, the following may
be said here:

CASE 1: F of rank 2 and anisotropic (Euclidean metric). In this case
every pair (11) satisfies the axiom system. If the vector space metrized by
F contains two orthogonal unit vectors, then in every pair (11) all
"associated" subsystems that satisfy the axiom system can be described
algebraically. The subring of K generated by the elements $(1 + c^2)^{-1}$
with $c \in K$ plays a role here.

CASE 2: F of rank 3 and anisotropic (elliptic metric). In this case too
every pair (11) satisfies the axiom system. Examples of proper subsystems
that satisfy the axiom system are known; a general characterization appears
to be more difficult than in Case 1.

CASE 3: F of rank 3 and not anisotropic (hyperbolic metric). In this case
no pair (11) satisfies the axiom system. However, there can be a proper
subsystem S of $S_3^+(K, F)$ such that the pair $O_3^+(K, F)$, S satisfies
the axiom system. If F is normalized so that the determinant of F is the
square class of K to which 1 belongs, then the following holds: if K is
ordered, and S is the set of the elements of $S_3^+(K, F)$ with negative
norm, then $O_3^+(K, F)$, S satisfies the axiom system. All invariant
subsystems, and examples of non-invariant subsystems, of the pairs (11) that
satisfy the axiom system are known.


Bibliography

BOLYAI, J., Appendix. Scientiam spatii absolute veram exhibens: a veritate aut falsitate
Axiomatis XI Euclidei (a priori haud unquam decidenda) independentem: adjecta
ad casum falsitatis, quadratura circuli geometrica. Maros-Vásárhely 1832.

WIENER, H., Die Zusammensetzung zweier endlicher Schraubungen zu einer einzigen.
Zur Theorie der Umwendungen. Über geometrische Analysen. Über geometrische
Analysen, Fortsetzung. Über die aus zwei Spiegelungen zusammengesetzten
Verwandtschaften. Über Gruppen vertauschbarer zweispiegeliger Verwandtschaften.
Berichte über die Verhandlungen der Kgl. Sächsischen Gesellschaft der
Wissenschaften zu Leipzig. Mathematisch-naturwissenschaftliche Klasse. Band
42 (1890), S. 13-23, 71-87, 245-267; Band 43 (1891), S. 424-447, 644-673; Band
45 (1893), S. 555-598.

DEHN, M., Die Legendreschen Sätze über die Winkelsumme im Dreieck. Mathematische
Annalen. Band 53 (1900), S. 404-439.

HESSENBERG, G., Neue Begründung der Sphärik. Sitzungsberichte der Berliner
Mathematischen Gesellschaft. Band 4 (1905), S. 69-77.

HJELMSLEV, J., Neue Begründung der ebenen Geometrie. Mathematische Annalen.
Band 64 (1907), S. 449-474.

SCHUR,  F.,  Grundlagen  der  Geometrie.  Leipzig  1909.  X+192  S. 

HJELMSLEV, J., Einleitung in die allgemeine Kongruenzlehre. Det Kgl. Danske
Videnskabernes Selskab, Matematisk-fysiske Meddelelser. Band 8 (1929), Nr. 11.
Band 10 (1929), Nr. 1. Band 19 (1942), Nr. 12. Band 22 (1945), Nr. 6. Band 22
(1945), Nr. 13. Band 25 (1949), Nr. 10.

HESSENBERG, G., Grundlagen der Geometrie. Berlin/Leipzig 1930. 143 S.

THOMSEN, G., Grundlagen der Elementargeometrie in gruppenalgebraischer
Behandlung. Hamburger Mathematische Einzelschriften. Heft 15. Leipzig/Berlin
1933. 88 S.

REIDEMEISTER, K., Geometria proiettiva non euclidea. Rendiconti del Seminario
Matematico della R. Università di Roma. Serie III, volume 1, parte 2
(1934), p. 219-228.

BACHMANN, F., Eine Begründung der absoluten Geometrie in der Ebene.
Mathematische Annalen. Band 113 (1936), S. 424-451.

SCHMIDT, ARNOLD, Die Dualität von Inzidenz und Senkrechtstehen in der absoluten
Geometrie. Mathematische Annalen. Band 118 (1943), S. 609-635.

SPERNER, E., Ein gruppentheoretischer Beweis des Satzes von Desargues in der
absoluten Axiomatik. Archiv der Mathematik. Band 5 (1954), S. 458-468.

SCHÜTTE, K., Die Winkelmetrik in der affin-orthogonalen Ebene. Mathematische
Annalen. Band 130 (1955), S. 183-195.

SCHÜTTE, K., Gruppentheoretisches Axiomensystem einer verallgemeinerten
euklidischen Geometrie. Mathematische Annalen. Band 132 (1956), S. 43-62.

BACHMANN, F., Aufbau der Geometrie aus dem Spiegelungsbegriff. Die Grundlehren
der mathematischen Wissenschaften. Band 96. Berlin/Göttingen/Heidelberg
1959. XIV+312 S.
[The axiomatic construction of plane absolute geometry sketched here is
carried out in the last-named book.]


Symposium  on  the  Axiomatic  Method 


NEW  METRIC  POSTULATES  FOR  ELLIPTIC  n-SPACE 

LEONARD  M.  BLUMENTHAL 

University  of  Missouri,  Columbia,  Missouri,    U.S.A. 

1. Introduction. In its most general aspects, a distance space is formed
from an abstract set S by mapping the set of all ordered pairs of elements
of S into a second set, which may be a subset of S. It is suggestive to call
the elements of S points, and the elements of the second set distances.
Distance spaces are particularized by specifying the distance sets and by
postulating properties of the mapping. If, for example, the distance set is
the class of non-negative real numbers, and the mapping that associates
with each pair p, q of elements of the set S the number pq is definite (that
is, pq = 0 if and only if p = q), and symmetric (pq = qp), the resulting
distance space is called semimetric. The class of metric spaces is obtained
by assuming, in addition, that if $p, q, r \in S$, the associated distances
pq, qr, pr satisfy the triangle inequality, $pq + qr \geq pr$. For each
positive integer n, the classical spaces (euclidean, spherical, hyperbolic,
and elliptic) of n dimensions are metric spaces.
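The definitions of semimetric and metric spaces can be checked mechanically on a finite set of points. The following is a minimal sketch, assuming the distances are given as a square matrix `d` (a list of lists) indexed by pairwise distinct points; the function names and the tolerance parameter are illustrative and not part of the paper.

```python
def is_semimetric(d, tol=1e-12):
    """True if d is definite (pq = 0 iff p = q) and symmetric (pq = qp).

    Assumes the indexed points are pairwise distinct, so every
    off-diagonal distance must be strictly positive.
    """
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:
            return False
        for j in range(n):
            if i != j and d[i][j] <= tol:
                return False
            if abs(d[i][j] - d[j][i]) > tol:
                return False
    return True

def is_metric(d, tol=1e-12):
    """True if d is semimetric and satisfies pq + qr >= pr for all p, q, r."""
    if not is_semimetric(d, tol):
        return False
    n = len(d)
    return all(d[i][j] + d[j][k] >= d[i][k] - tol
               for i in range(n) for j in range(n) for k in range(n))
```

For example, three points with distances 1, 1, 3 form a semimetric triple that is not metric, while distances 1, 1, 2 give a metric (indeed linear) triple.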

A given distance space $\Sigma$ is characterized metrically with respect
to a prescribed class of distance spaces when necessary and sufficient
conditions, expressed wholly and explicitly in terms of the distance, are
formulated in order that any member of the class may be mapped onto
$\Sigma$ in a distance-preserving manner. A mapping of this kind is called a
congruence. It is clear that such a metric characterization induces an
axiomatization of $\Sigma$ in terms of the sole (geometric) primitive
notions of point and distance when the given class of comparison spaces is
sufficiently general.

Euclidean spaces $R_n$ were the first to be studied in this manner. In his
Zweite Untersuchung, Menger obtained metric postulates for euclidean
n-space by first solving the more general problem of characterizing
metrically subsets of $R_n$, with respect to the class of semimetric spaces
[6]. With this accomplished, the solution of the space problem follows
upon adjoining to the metric characterization of its subsets (with respect
to the class of semimetric spaces) those metric properties that serve to
distinguish the $R_n$ itself among its subsets. It was noted by W. A. Wilson,
however, that though none of Menger's conditions for congruently
imbedding an arbitrary semimetric space into the $R_n$ can be suppressed,
the set of assumptions obtained by adjoining to those conditions the
properties that individualize the $R_n$ among the subsets (needed to
characterize the whole $R_n$) can be very materially reduced [8]. Wilson's
reduction consists in replacing Menger's assumption that for every integer
k, (1 < k < n), each (k + 1)-tuple of points of a semimetric space can be
congruently imbedded in $R_n$, by the much milder requirement that each
four points be imbeddable in $R_3$. The crucial imbedding sets are thus
quadruples of points, regardless of the dimension of the euclidean space
being characterized.

The following comments concerning Wilson's contribution are pertinent.
(1). In validating the sufficiency of his "four-point" property,
Wilson made use of Menger's imbedding theorems for (k + 1)-tuples,
(1 < k < n). A simpler argument by the writer, using a weaker four-point
property, is quite independent of those results, and so solved the
space problem without any reference to the subset problem [1, pp. 123-128].
(2). The four-point property of Wilson suggests numerous weaker
properties which have been investigated by the writer and others [2].
This paper is concerned with an investigation of weak four-point properties
that arise in the metric study of elliptic spaces.

2. First metric axiomatization of elliptic space. Metric postulates for
spherical and hyperbolic spaces, arising from their metric
characterizations with respect to the class of semimetric spaces, were
established by the writer in 1935 and 1937, respectively.¹ But the numerous
metric abnormalities of elliptic space rendered its investigation (in the
purely metric manner imposed by the program) a more difficult matter, and it
was not until 1946 that the first set of metric postulates for finite and
infinite dimensional elliptic spaces was obtained [3]. Chief among the
metric features of elliptic space that make inapplicable the methods used
in the metric characterizations of other classical spaces are the following.

(1) Distinction between congruence and superposability. Defining a
motion as a congruent mapping of a space onto itself, two subsets are
called superposable provided there is a motion that maps one onto the
other. In contrast to the other classical spaces, two subsets of elliptic
space may be congruent without being superposable.

(2) Distinction between "contained in" and "congruently contained in".
In any of the classical spaces other than the elliptic, a subset that is
congruent with a subset of a subspace is actually contained in a subspace
of the same dimension. This is not the case in elliptic space.

¹ See [1].

(3) Dependence not a congruence invariant. An m-tuple of a space is
usually called dependent when it is contained in an (m - 2)-dimensional
subspace. With this convention, a dependent m-tuple of elliptic space may
be congruent with one that is not dependent.

(4)  Non-linearity  of  the  equidistant  locus.  The  locus  of  points  of  the 
elliptic  plane  that  are  equidistant  from  two  distinct  points  consists  of  two 
mutually  perpendicular  elliptic  lines,  and  hence  no  subset  contained  in 
two  such  lines  forms  a  metric  basis. 

(5) Cardinality of the maximal equilateral set. The elliptic plane
contains six points with all fifteen distances equal. No equilateral septuple
exists in the plane or in elliptic three-space. The cardinality of the maximal
equilateral subset of elliptic n-space is not known for n > 3.
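Feature (5) can be illustrated numerically. The sketch below assumes the standard model of the elliptic plane with space constant r, which the text does not spell out: points are lines through the origin of three-dimensional euclidean space, and the distance of two points with direction vectors x, y is r arccos(|x.y| / (|x||y|)). The six directions used are the diagonals of a regular icosahedron; the names `elliptic_distance`, `SIX_POINTS`, and `pairwise_distances` are illustrative.

```python
import math
from itertools import combinations

def elliptic_distance(x, y, r=1.0):
    """Elliptic distance between the lines spanned by x and y (assumed model)."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    c = min(1.0, abs(dot) / (nx * ny))  # clamp against rounding above 1
    return r * math.acos(c)

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

# One representative vector for each of the six diagonals of an icosahedron.
SIX_POINTS = [
    (0, 1, PHI), (0, 1, -PHI),
    (1, PHI, 0), (1, -PHI, 0),
    (PHI, 0, 1), (-PHI, 0, 1),
]

def pairwise_distances(points, r=1.0):
    """All distances between distinct points, in combination order."""
    return [elliptic_distance(p, q, r) for p, q in combinations(points, 2)]
```

All fifteen values returned by `pairwise_distances(SIX_POINTS)` agree with arccos(1/sqrt(5)) up to rounding, and no distance in this model exceeds the diameter pi r/2.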

The following set of metric postulates for elliptic space (with positive
space constant r) was established in [3]. Let $\Sigma_r$ denote a distance
space containing at least two points.

POSTULATE I. $\Sigma_r$ is semimetric.

POSTULATE II. $\Sigma_r$ is metrically convex (that is, if
$a, c \in \Sigma_r$, $a \neq c$, $\Sigma_r$ contains a point b such that
$a \neq b \neq c$ and $ab + bc = ac$).

The point b is said to be between a and c, and the relation is symbolized
by writing abc.

POSTULATE III. The diameter of $\Sigma_r$ is at most $\pi r/2$.

POSTULATE IV. $\Sigma_r$ is metrically complete.

POSTULATE V. If $p, q \in \Sigma_r$, $pq \neq \pi r/2$, then $\Sigma_r$
contains points p*, q* such that pqp*, qpq* subsist, and
$pp^* = qq^* = \pi r/2$.

Two points with distance $\pi r/2$ are called diametral. If
$p \in \Sigma_r$, p* or d(p) will denote a diametral point of p; that is,
$pp^* = p\,d(p) = \pi r/2$.

DEFINITION. Three points of $\Sigma_r$ (not necessarily pairwise distinct)
are LINEAR provided the sum of two of the three distances they determine
equals the third.

If $p_1, p_2, p_3 \in \Sigma_r$ let $\Delta^*$ denote the determinant
$|\varepsilon_{ij} \cos(p_i p_j / r)|$, $(i, j = 1, 2, 3)$, where every
$\varepsilon_{ij}$ is 1, except that $\varepsilon_{23} = \varepsilon_{32} = -1$.

A symmetric matrix $(\varepsilon_{ij})$,
$\varepsilon_{ij} = \varepsilon_{ji} = \pm 1$, $\varepsilon_{ii} = 1$,
$(i, j = 1, 2, \ldots, m)$ is called an EPSILON MATRIX.


POSTULATE VI. Let $p_0, p_1, \ldots, p_4$ be any five pairwise distinct
points of $\Sigma_r$ with (i) two triples linear, and (ii) the determinant
$\Delta^*$ of three of the points (one of which is common to the two linear
triples) negative. Then an epsilon matrix $(\varepsilon_{ij})$,
$(i, j = 0, 1, \ldots, 4)$ exists such that all principal minors of the
determinant $|\varepsilon_{ij} \cos(p_i p_j / r)|$,
$(i, j = 0, 1, \ldots, 4)$ are non-negative.

These postulates insure that all subspaces of $\Sigma_r$ (properly
defined) of finite or infinite dimensions are elliptic (that is, congruent
with the classical elliptic spaces with space constant r).

To axiomatize elliptic n-space $E_{n,r}$, for a given positive integer n,
it suffices to adjoin the following (local) postulate.

POSTULATE VII. The integer n is the smallest for which a point $q_0$ of
$\Sigma_r$ and a spherical neighborhood $U(q_0)$ exist such that each n + 2
points $p_0, p_1, \ldots, p_{n+1}$ of $U(q_0)$ have the property that if
there is an epsilon matrix $(\varepsilon_{ij})$ such that no principal minor
of the determinant $|\varepsilon_{ij} \cos(p_i p_j / r)|$,
$(i, j = 0, 1, \ldots, n + 1)$ is negative, then an epsilon matrix
$(\varepsilon_{ij})$ exists such that no principal minor of
$|\varepsilon_{ij} \cos(p_i p_j / r)|$, $(i, j = 0, 1, \ldots, n + 1)$ is
negative, and the determinant vanishes.

Interpreted geometrically, Postulate VI asserts that each quintuple (of
a prescribed subclass of the class of all those quintuples of $\Sigma_r$
containing two linear triples) is congruently imbeddable in an elliptic
space with space constant r. The condition $\Delta^* < 0$ means that the
perimeter of the three points for which it is formed is less than $\pi r$
and imparts a local nature to the postulate. It is observed, moreover, that
the specific (elliptic) character of the space defined by Postulates I-VI is
determined by Postulate VI alone. In view of the discussion above of
four-point properties, it is natural to seek to replace the five-point
property expressed in Postulate VI by simpler four-point properties. The
suggestion to do so, made in the concluding section of [3], was acted upon
in the (unpublished) Missouri doctoral dissertation of J. D. Hankins
(supervised by the writer) which provides the basis for the present
contribution [5].
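The relation between the sign of the determinant Delta* and the perimeter of a triple can be checked numerically. The sketch below expands the 3x3 determinant with entries epsilon_ij cos(p_i p_j / r), where epsilon_23 = epsilon_32 = -1 and all other epsilons are 1, into the closed form 1 - a^2 - b^2 - c^2 - 2abc with a, b, c the three cosines. The function name `delta_star` and its argument order are illustrative assumptions.

```python
import math

def delta_star(d12, d13, d23, r=1.0):
    """Determinant of [[1, a, b], [a, 1, -c], [b, -c, 1]] with
    a = cos(d12/r), b = cos(d13/r), c = cos(d23/r)."""
    a = math.cos(d12 / r)
    b = math.cos(d13 / r)
    c = math.cos(d23 / r)
    return 1 - a * a - b * b - c * c - 2 * a * b * c
```

For a metric triple with r = 1 and distances 0.5, 0.6, 0.7 (perimeter 1.8, less than pi) the value is negative; for the equilateral triple with side 1.2 (perimeter 3.6, greater than pi) it is positive; and for side pi/3 (perimeter exactly pi) it vanishes up to rounding.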

3. Classes of quadruples and corresponding four-point properties. The
following seven classes of semimetric quadruples of pairwise distinct
points play a role in what follows.

A semimetric quadruple $p_1, p_2, p_3, p_4$ belongs to class

{Q1} if and only if it contains a linear triple,

{Q2} if and only if $p_2p_3p_4$ subsists and $p_2p_3 = p_3p_4$,

{Q3} if and only if $p_2p_3p_4$ subsists, $p_2p_3 = p_3p_4$, and the
perimeter of every three of the four points is less than
$\pi r + \varepsilon$, where r and $\varepsilon$ are arbitrarily chosen
positive constants,

{Q4} if and only if $p_2p_3p_4$ subsists and $p_1p_2 = p_1p_4$,

{Q5} if and only if $p_2p_3p_4$ subsists, $p_2p_3 = p_3p_4$, and
$p_1p_2 = p_1p_4$,

{Q6} if and only if $p_2p_3p_4$ subsists, $p_2p_3 = 2\,p_3p_4$, and
$p_1p_2 = p_1p_3$,

{Q7} if and only if the quadruple contains two linear triples.

Clearly {Qi} ⊂ {Q1}, (i = 2, 3, ..., 7) and {0,} C {0,}.

DEFINITION. A semimetric space has the elliptic WEAK, FEEBLE,
$\varepsilon$-FEEBLE, ISOSCELES WEAK, ISOSCELES FEEBLE, EXTERNAL ISOSCELES
FEEBLE four-point property if every quadruple of its points of class {Q1},
{Q2}, ..., {Q6}, respectively, is congruently imbeddable in an elliptic space
with space constant r. The space has the ELLIPTIC STRONG TWO-TRIPLE
PROPERTY if each of its quadruples of class {Q7} is congruently imbeddable in
an elliptic line.
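Membership in the two classes defined purely by linear triples (one linear triple, or two) can be read off from the six distances of a quadruple. The following is a minimal sketch, assuming the distances are given as a symmetric matrix `d` of pairwise distinct points; the function names and the tolerance are illustrative.

```python
from itertools import combinations

def is_linear(d, i, j, k, tol=1e-9):
    """True if the sum of two of the three distances equals the third."""
    a, b, c = d[i][j], d[j][k], d[i][k]
    return (abs(a + b - c) <= tol
            or abs(a + c - b) <= tol
            or abs(b + c - a) <= tol)

def linear_triples(d, tol=1e-9):
    """All index triples (i, j, k), i < j < k, that are linear."""
    return [t for t in combinations(range(len(d)), 3)
            if is_linear(d, *t, tol=tol)]

def has_linear_triple(d, tol=1e-9):
    """The quadruple contains at least one linear triple."""
    return len(linear_triples(d, tol)) >= 1

def has_two_linear_triples(d, tol=1e-9):
    """The quadruple contains at least two linear triples."""
    return len(linear_triples(d, tol)) >= 2
```

The remaining classes add equalities between specific distances, which could be tested in the same way once the labelling of the four points is fixed.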

The writer has established elsewhere the following imbedding theorem.²

THEOREM 3.1. A semimetric m-tuple $p_1, p_2, \ldots, p_m$ is congruently
imbeddable in elliptic n-space $E_{n,r}$ if and only if (i)
$p_i p_j \leq \pi r/2$, $(i, j = 1, 2, \ldots, m)$, and (ii) there exists an
epsilon matrix $(\varepsilon_{ij})$, $(i, j = 1, 2, \ldots, m)$, such that
the determinant $|\varepsilon_{ij} \cos(p_i p_j / r)|$,
$(i, j = 1, 2, \ldots, m)$, has rank not exceeding n + 1, with all
non-vanishing principal minors positive.
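The criterion of Theorem 3.1 can be tried out by brute force on small tuples. The sketch below enumerates all epsilon matrices, rejects those with a negative principal minor, and reads off the rank as the largest order of a non-vanishing principal minor (a fact about symmetric matrices that is used here as an assumption, not taken from the text). It returns the smallest n for which the criterion holds, or None. The function names, the floating-point tolerance, and the Laplace-expansion determinant are illustrative choices suited only to very small m.

```python
import math
from itertools import combinations, product

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def epsilon_matrices(m):
    """Yield every symmetric m x m matrix of +1/-1 entries with unit diagonal."""
    pairs = list(combinations(range(m), 2))
    for signs in product((1, -1), repeat=len(pairs)):
        eps = [[1] * m for _ in range(m)]
        for (i, j), s in zip(pairs, signs):
            eps[i][j] = eps[j][i] = s
        yield eps

def min_elliptic_dimension(d, r=1.0, tol=1e-9):
    """Smallest n satisfying conditions (i) and (ii) of the theorem, else None."""
    m = len(d)
    if any(d[i][j] > math.pi * r / 2 + tol for i in range(m) for j in range(m)):
        return None  # condition (i) fails
    best = None
    for eps in epsilon_matrices(m):
        g = [[eps[i][j] * math.cos(d[i][j] / r) for j in range(m)]
             for i in range(m)]
        rank, admissible = 0, True
        for k in range(1, m + 1):
            for idx in combinations(range(m), k):
                minor = det([[g[i][j] for j in idx] for i in idx])
                if minor < -tol:
                    admissible = False
                    break
                if minor > tol:
                    rank = max(rank, k)
            if not admissible:
                break
        if admissible and (best is None or rank - 1 < best):
            best = rank - 1  # rank <= n + 1 means n >= rank - 1
    return best
```

For the equilateral triple with side pi r/3 the function returns 1: with all epsilons equal to 1 the determinant is positive (a non-linear triple of the plane), while a suitable change of sign makes it vanish (three points of an elliptic line), which illustrates how one triple can be congruent both to a dependent and to an independent triple.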

With the aid of this theorem (a) conditions for the congruent imbedding
in elliptic space of each quadruple of the classes {Q1}, {Q2}, ..., {Q7} are
expressed in terms of the six distances determined by the quadruple, and
(b) if a quadruple of class {Qi}, (i = 1, 2, ..., 6), is congruently
imbeddable in $E_{n,r}$, then it is congruently imbeddable in $E_{2,r}$.

DEFINITION. A $\Sigma_r$ space is any space for which Postulates I-V are
valid.

The following sections investigate $\Sigma_r$ spaces that have one or more
of the four-point properties defined above.

4. Spaces $\Sigma_r$ with the elliptic weak four-point property. Let
$\Sigma_r(w)$ denote a $\Sigma_r$ space with the elliptic weak four-point
property. It is proved in [3] that the weak four-point property is possessed
by spaces in which Postulates I-VI are valid; that is, in the presence of
Postulates I-V, Postulate VI implies the weak four-point property. This
section is devoted to showing that in the same environment, the weak
four-point property implies Postulate VI.

² See [1], p. 208.

The  following  three  theorems  were  either  established  in  [3]  by  using  the 
weak  four-point  property  (instead  of  Postulate  VI)  or  their  proofs  are 
immediate. 

THEOREM 4.1. Each $\Sigma_r(w)$ space is metric and every triple of points
is congruently imbeddable in $E_{2,r}$.

THEOREM 4.2. Two distinct non-diametral points p, q of a $\Sigma_r(w)$
space are endpoints of a unique metric segment (denoted by seg(p, q)).

COROLLARY. If $p, q \in \Sigma_r(w)$, $p', q' \in E_{n,r}$,
$pq = p'q' \neq \pi r/2$, there exists a unique extension of the congruence
$p, q \approx p', q'$ to the congruence seg(p, q) $\approx$ seg(p', q').³

THEOREM 4.3. If $p, q \in \Sigma_r(w)$, $(0 < pq < \pi r/2)$ there is
exactly one point p* of $\Sigma_r(w)$ such that pqp* subsists and
$pp^* = \pi r/2$.

Now if $p, q \in \Sigma_r(w)$, $(0 < pq < \pi r/2)$ and p*, q* are the
unique points diametral to p, q, respectively, with pqp* and qpq*
subsisting, the unique metric segments seg(p, q), seg(q, p*), seg(p*, q*),
seg(q*, p) have pairwise at most endpoints in common and it follows that the
two metric segments

seg(p, q, p*) = seg(p, q) + seg(q, p*),
seg(p, q*, p*) = seg(p, q*) + seg(q*, p*),

have only p, p* in common.

DEFINITION. If $p, q \in \Sigma_r(w)$, $(0 < pq < \pi r/2)$, then
seg(p, q, p*) + seg(p, q*, p*) is called a one-dimensional subspace
$\Sigma_r^1(p, q)$ of $\Sigma_r(w)$, with base points p, q, where pqp* and
qpq* subsist.

THEOREM 4.4. A one-dimensional subspace $\Sigma_r^1$ of $\Sigma_r(w)$ is
congruent with the elliptic line $E_{1,r}$.

PROOF. $\Sigma_r^1$ = seg(p, q, p*) + seg(p, q*, p*), where p, q are base
points of $\Sigma_r^1$. It follows from the weak four-point property that
points a, b, a*, b* of an elliptic line $E_{1,r}$ exist such that
$p, q, p^*, q^* \approx a, b, a^*, b^*$, and the two congruences

(*)  seg(p, q, p*) $\approx$ seg(a, b, a*),

(**)  seg(p, q*, p*) $\approx$ seg(a, b*, a*),

map $\Sigma_r^1(p, q)$ onto $E_{1,r}(a, b)$. To show the mapping is a
congruence it is clear that only the two following cases need be examined in
detail.

³ The notation $p_1, p_2, \ldots, p_k \approx q_1, q_2, \ldots, q_k$
signifies that $p_i p_j = q_i q_j$, $(i, j = 1, 2, \ldots, k)$. The symbol
"$\approx$" is read, "is (are) congruent to".

Case 1. $x \in$ seg(p*, q*), $p^* \neq x \neq q^*$, $y \in$ seg(q, p*),
$q \neq y \neq p^*$. From qyp* and qp*q* follows yp*q*, and in a similar
manner yp*x subsists. Hence $xy = xp^* + p^*y$, and letting x', y'
correspond to x, y by the congruences (**), (*), respectively, the same
considerations establish $x'y' = x'a^* + a^*y'$. Since $xp^* = x'a^*$ and
$p^*y = a^*y'$, then $xy = x'y'$.⁴

Case 2. $x \in$ seg(q, p*), $q \neq x \neq p^*$, $y \in$ seg(p, q*),
$p \neq y \neq q^*$. Since qxp*, qp*q* imply xp*q*, and pyq*, pq*p* imply
yq*p*, the quadruple x, p*, q*, y contains two linear triples and hence
points x'', y'', p'', q'' of $E_{1,r}$ exist such that
$x'', y'', p'', q'' \approx x, y, p^*, q^*$. Since
$x', a^*, b^* \approx x, p^*, q^* \approx x'', p'', q''$, a motion G of
$E_{1,r}$ onto itself exists with
$G(x'', y'', p'', q'') = (x', \bar{y}, a^*, b^*)$. But
$\bar{y}a^* = y''p'' = yp^* = y'a^*$, and
$\bar{y}b^* = y''q'' = yq^* = y'b^*$. It follows that $\bar{y} = y'$ (since
$a^*b^* \neq \pi r/2$) and so $xy = x'y'$.⁵

LEMMA 4.1. If $s, t \in \Sigma_r^1$ $(0 < st < \pi r/2)$, then
seg(s, t) $\subset \Sigma_r^1$.
The proof may be taken from [3].

LEMMA 4.2. Any pair of distinct points of $\Sigma_r(w)$ is contained in a
unique subspace $\Sigma_r^1$.

PROOF. If the pair is non-diametral, the result is proved as in [3]. Let
p, p* denote a diametral point pair of $\Sigma_r(w)$ and suppose
$q \in \Sigma_r(w)$ with pqp*. The unique subspace $\Sigma_r^1(p, q)$
contains p, p*, and by Theorem 4.4,
$\Sigma_r^1(p, q) \approx E_{1,r}(p', q')$. Let $\Sigma^*$ denote any
one-dimensional subspace of $\Sigma_r(w)$ containing p and p*, and suppose
$x \in \Sigma^*$, $x \neq p, p^*, q$. Since there are two linear triples in
the quadruple p, q, x, p*, then
$p, q, x, p^* \approx p'', q'', x'', d(p'')$ of $E_{1,r}(p', q')$, where
$p''d(p'') = \pi r/2$. A motion G exists such that
$G(p'', q'', x'', d(p'')) = (p', q', \bar{x}, d(p'))$, and one of the
relations $p'\bar{x}q'$, $q'\bar{x}d(p')$, $d(p')\bar{x}d(q')$,
$d(q')\bar{x}p'$ subsists, or $\bar{x}$ coincides with one of the points
p', q', d(p'), d(q'). But then x satisfies the corresponding relation in the
unprimed letters, and Lemma 4.1 yields
$\Sigma^* \subset \Sigma_r^1(p, q)$. Interchanging the roles of $\Sigma^*$
and $\Sigma_r^1$ gives $\Sigma_r^1(p, q) \subset \Sigma^*$.

4 Obvious modifications of the argument are used in case x = q*, y = q, etc.

5 No difficulties are encountered when x, y are not interior points of the segments
from which they are chosen.



LEMMA 4.3. Two congruent triples p1, p2, p3 and p1', p2', p3' of E2,r
are superposable if (i) one of the distances pipj (i, j = 1, 2, 3) equals πr/2, or
(ii) Δ*(p1, p2, p3) < 0.

PROOF.     The  proof  is  given  in  [3]. 

THEOREM 4.5. Let p1, p2, p3 be three pairwise distinct points of Er(w)
with Δ*(p1, p2, p3) < 0, and p1', p2', p3' points of E2,r with p1, p2, p3 ≈
p1', p2', p3'. The congruences

(1)  Er1(p1, p2) ≈ E1,r(p1', p2'),

(2)  Er1(p1, p3) ≈ E1,r(p1', p3'),

determine uniquely the congruence

Er1(p1, p2) + Er1(p1, p3) ≈ E1,r(p1', p2') + E1,r(p1', p3').

PROOF. Since p1, p2, p3 are congruently imbeddable in E2,r, and
Δ*(p1, p2, p3) < 0, it follows that Δ(p1, p2, p3) = |cos(pipj/r)|, (i, j = 1,
2, 3) is non-negative, and no one of the distances pipj (i, j = 1, 2, 3) is
πr/2. Hence p1, p2 and p1, p3 are base points of one-dimensional subspaces
Er1(p1, p2) and Er1(p1, p3), respectively. The congruences (1), (2) in which
pi and pi' (i = 1, 2, 3) are corresponding points are unique. If Δ(p1, p2, p3)
= 0, then p3 ∈ Er1(p1, p2) and so Er1(p1, p2) and Er1(p1, p3) coincide.
Similarly, E1,r(p1', p2') = E1,r(p1', p3') and the theorem follows from
Theorem 4.4.

If, now, Δ(p1, p2, p3) = Δ(p1', p2', p3') > 0, then (since Δ*(p1, p2, p3) <
0) the points p1', p2', p3' neither lie on an elliptic line, nor are they
congruent with points of a line, and so p1, p2, p3 are not contained in any
Er1 of Er(w). The congruences (1), (2) give a mapping of Er1(p1, p2) +
Er1(p1, p3) onto E1,r(p1', p2') + E1,r(p1', p3'). To prove the mapping a
congruence, suppose x ∈ Er1(p1, p2), y ∈ Er1(p1, p3), and let x', y' denote
their corresponding points by congruences (1), (2), respectively.

CASE I. x ∈ seg(p1, p2), p1 ≠ x ≠ p2, y ∈ seg(p1, p3). The possibilities
y = p1, y = p3 offer no difficulties. Supposing that p1yp3 holds, then
points p1", y", p2", p3" of E2,r exist with p1, y, p2, p3 ≈ p1", y", p2", p3",
and since p1, p2, p3 ≈ p1', p2', p3' ≈ p1", p2", p3", with Δ*(p1, p2, p3) < 0,
a motion G of E2,r exists such that G(p1", y", p2", p3") = p1', ȳ, p2', p3'.
It is easily seen that ȳ = y' and hence p2'y' = p2y.

Now from p1xp2 follows the existence of points such that p1", p2",
x", y" ≈ p1, p2, x, y, where the first quadruple is in E2,r, and p1", p2", y"



are not necessarily those points (with the same notation) considered in
the preceding paragraph. From Δ*(p1, p2, p3) < 0 follows


and so Δ*(p1, p2, y) < 0. This permits applying to the quadruple p1, p2,
x, y the argument applied above to p1, p2, p3, y, and the congruence
p1, p2, x, y ≈ p1', p2', x', y' is obtained, yielding xy = x'y'.

CASE II. x ∈ seg(p2, d1(p1)), p2 ≠ x ≠ d1(p1), y ∈ seg(p1, p3, d2(p1)),
p1 ≠ y ≠ d2(p1), where d1(p1), d2(p1) denote points of Er1(p1, p2),
Er1(p1, p3), respectively, that are diametral to p1.

Let {qj} be a point sequence of Er1(p1, p2) with the limit p1, and p1qjp2
(j = 1, 2, ...). An index m exists such that qmp1 + p1y + yqm < πr. For
setting k = πr/2 − p1y > 0, and selecting m so that qmp1 < k/2 gives
qmp1 + p1y + yqm ≤ 2(qmp1 + p1y) < πr − k. It follows that Δ*(p1,
qm, y) < 0.
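
The perimeter estimate just used can be written out step by step. The following is only a sketch of the arithmetic, in LaTeX notation, with q_m, p_1, y and k as in the text; the first step assumes the triangle inequality for the triple q_m, p_1, y, which the inequality in the text presupposes.

```latex
\begin{aligned}
q_m p_1 + p_1 y + y q_m
  &\le q_m p_1 + p_1 y + (y p_1 + p_1 q_m)
     && \text{(triangle inequality for } y q_m\text{)} \\
  &= 2\,(q_m p_1 + p_1 y) \\
  &< 2\left(\tfrac{k}{2} + \tfrac{\pi r}{2} - k\right)
     && \text{(since } q_m p_1 < k/2,\ p_1 y = \pi r/2 - k\text{)} \\
  &= \pi r - k \;<\; \pi r .
\end{aligned}
```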

The quadruple p1, p3, qm, y contains the linear triple p1, p3, y, and
consequently p1, p3, qm, y ≈ p1", p3", qm", y", with the latter quadruple in
E2,r. By Case I, p1, p3, qm ≈ p1', p3', qm', and since Δ*(p1, qm, y) < 0 for
each point y of seg(p1, p3, d2(p1)), then Δ*(p1", p3", qm") = Δ*(p1, p3, qm)
< 0, and a motion of E2,r sending p1", p3", qm" into p1', p3', qm',
respectively, gives p1, p3, qm, y ≈ p1', p3', qm', ȳ. The linearity of p1, p3, y
implies that of p1', p3', ȳ and p1', p3', y'. Consequently p1', p3', ȳ ≈
p1, p3, y ≈ p1', p3', y' implies ȳ = y' and qmy = qm'y'.

Turning now to the quadruple p1, qm, x, y, the linearity of p1, qm, x
together with the relations p1, qm, y ≈ p1', qm', y', Δ*(p1, qm, y) < 0,
permits applying the above procedure to obtain xy = x'y'.

The various cases arising from x and/or y coinciding with one of the
points p1, p2, p3, d1(p1), d2(p1) are all easily handled, and we may
conclude that

(3)  seg(p1, p2, d1(p1)) + seg(p1, p3, d2(p1)) ≈
     seg(p1', p2', d1(p1')) + seg(p1', p3', d2(p1')).


CASE III. x ∈ Er1(p1, p2), y ∈ Er1(p1, p3), with p2p1x, p3p1y
subsisting, and Δ*(p1, p2, y) < 0, Δ*(p1, x, y) < 0.

The method used in Case I is readily applied to yield p3x = p3'x' and
p2y = p2'y'. Now p1, p2, x, y ≈ p1", p2", x", y", points of E2,r. The
relations p1, p2, y ≈ p1', p2', y', Δ*(p1', p2', y') = Δ*(p1, p2, y) < 0, p2'p1'x',
p1p2 ≠ πr/2 yield p1", p2", x", y" ≈ p1', p2', x', y', and so xy = x'y'.



Let o1, o2 denote points that satisfy the conditions imposed above on
x, y, respectively. Using those points in place of p2, p3, and proceeding as
in Case II yields

(4)  seg(p1, o1, d1(p1)) + seg(p1, o2, d2(p1)) ≈
     seg(p1', o1', d1(p1')) + seg(p1', o2', d2(p1')).

ASSERTION. seg(p1, o1, d1(p1)) + seg(p1, p3, d2(p1)) ≈
     seg(p1', o1', d1(p1')) + seg(p1', p3', d2(p1')).


PROOF. Suppose x ∈ seg(o1, d1(p1)), and y ∈ seg(p1, p3, d2(p1)). It is
easily seen that seg(p1, o1) contains an interior point q, arbitrarily close
to p1, such that qp1 + p1y + yq < πr, and qp1 + p1p3 + p3q < πr.

By (4), p1, q, o2 ≈ p1', q', o2', and Δ*(p1', q', o2') < 0 follows from
Δ*(p1, o1, o2) < 0 and p1qo1. The familiar procedure now yields p1, p3,
q, o2 ≈ p1', p3', q', o2', points of E2,r. Similarly, it is shown that p1, p3, q,
y ≈ p1', p3', q', y'. Finally, p1, q, x, y ≈ p1", q", x", y", points of E2,r,
since xqp1 holds, and from p1, q, y ≈ p1', q', y', Δ*(p1', q', y') < 0,
p1'q'x', p1'q' ≠ πr/2, it follows that p1, q, x, y ≈ p1', q', x', y', and
xy = x'y'.

Cases  not  explicitly  treated  above  are  either  trivial  or  are  handled  in  a 
similar  manner. 

THEOREM  4.6.     Postulate  VI  is  valid  in  Er(w). 

PROOF. Let p0, p1, p2, p3, p4 be any five pairwise distinct points of
Er(w) with Δ*(p0, p1, p2) < 0, and each of the triples p0, p1, p3 and
p0, p2, p4 linear. Then by Theorem 4.5 the sum Er1(p0, p1) + Er1(p0, p2)
is congruently imbeddable in E2,r, and since p3, p4 are elements of the
first and second summand, respectively, the five points p0, p1, ..., p4 are
congruently imbeddable in E2,r.

It follows from Theorem 3.1 that the quintuple has the property stated
in the conclusion of Postulate VI.

THEOREM 4.7. Postulates I, II, III, IV, V, VIw, VII are metric
postulates for elliptic n-space, where Postulate VIw postulates the elliptic weak
four-point property. 6

5. Metric spaces with the elliptic feeble four-point property. The
objective of this section is to show that if Postulate I be strengthened to

6 Postulate VIw may be formulated to make Postulate III unnecessary.



require metricity, then the class of quadruples assumed congruently
imbeddable in E2,r may be restricted to the proper subclass {Q2} of class
{Q1}. Whether this restriction may be made without strengthening
Postulate II is an open question. Let Er(f) denote a metric space with the
elliptic feeble four-point property in which Postulates II-V are valid.
The following theorems are easily established.

THEOREM 5.1. Each point triple of Er(f) is congruently imbeddable
in E2,r.

THEOREM  5.2.  Two  distinct  non-diametral  points  of  Er(f)  are  joined  by 
exactly  one  metric  segment. 

This follows from (1) the existence of at least one metric segment
joining each two distinct points of any complete, metrically convex,
metric space, (2) the uniqueness of midpoints for non-diametral point pairs
of Er(f) (a consequence of the feeble four-point property, since such points
are unique in E2,r), and (3) the fact that each segment of Er(f) is the
closure of the dyadically rational points of the segment.
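
The uniqueness of midpoints in E2,r invoked in (2) can be checked numerically. The sketch below assumes the usual model of the elliptic plane (lines through the origin of three-space, with distance r times the angle between two lines); the function names `elliptic_distance` and `midpoints` are illustrative only and do not come from the paper.

```python
import math

def elliptic_distance(u, v, r=1.0):
    """Distance in the assumed model: r times the angle between the lines spanned by u, v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))  # clamp rounding overshoot
    return r * math.acos(cosine)

def midpoints(u, v, r=1.0, tol=1e-9):
    """Return the points m with d(u, m) = d(m, v) = d(u, v) / 2 (u, v distinct points)."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    u = tuple(a / nu for a in u)
    v = tuple(b / nv for b in v)
    half = elliptic_distance(u, v, r) / 2
    found = []
    # The two bisectors of the lines spanned by u and v are the only candidates.
    for m in (tuple(a + b for a, b in zip(u, v)),
              tuple(a - b for a, b in zip(u, v))):
        if all(abs(c) < tol for c in m):
            continue  # degenerate candidate (u and v span the same line)
        if (abs(elliptic_distance(u, m, r) - half) < tol
                and abs(elliptic_distance(m, v, r) - half) < tol):
            found.append(m)
    return found

# A non-diametral pair has one midpoint; a diametral pair (distance pi*r/2) has two.
non_diametral = midpoints((1, 0, 0), (1, 1, 0))
diametral = midpoints((1, 0, 0), (0, 1, 0))
```

In this model the pair at distance πr/2 acquires a second midpoint, which is why the uniqueness statement is restricted to non-diametral pairs.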

COROLLARY. The congruence p, q ≈ p', q', 0 < pq < πr/2, (p', q' ∈ E2,r)
has a unique extension to the congruence seg(p, q) ≈ seg(p', q').

REMARK. There is exactly one seg(p, q, p*), with p, q, p* ∈ Er(f) and
pqp* subsisting.

LEMMA 5.1. If p, s, m, d(s) ∈ Er(f) such that sd(s) = πr/2, m is a
midpoint of s, d(s) (that is, sm = md(s) = (1/2)sd(s)) and pd(s) < πr/2, then
points p', s', m', d(s') of E2,r exist such that (p, s) + seg(m, d(s)) ≈
(p', s') + seg(m', d(s')), where (p, s), (p', s') denote the sets consisting of the
points exhibited.

PROOF. If m1 denotes the unique midpoint of p, d(s), the feeble
four-point property gives m1, s, m, d(s) ≈ m1', s', m', d(s'), with the latter
points in E2,r. Similarly, p, m1, d(s), m ≈ p", m1", d(s"), m", points of
E2,r, and since Δ*(m, m1, d(s)) < 0, a motion of E2,r yields p, m1, d(s),
m ≈ p', m1', d(s'), m'. The theorem is proved by showing that the mapping

p ↔ p',     s ↔ s',     seg(m, d(s)) ≈ seg(m', d(s'))

is a congruence.

If x ∈ seg(m, d(s)) then sx = sm + mx = s'm' + m'x' = s'x'. Since
p, m1, d(s), s ≈ p", m1", d(s"), s", m1, s, d(s) ≈ m1', s', d(s'), and sd(s) =
πr/2, Lemma 4.3 yields p, m1, d(s), s ≈ p*, m1', d(s'), s'. Then p*, m1',



d(s') ≈ p', m1', d(s'), m1'd(s') ≠ πr/2, and p* ∈ E1,r(m1', d(s')), (since
pm1d(s) holds), imply p* = p', and so p, m1, d(s), s ≈ p', m1', d(s'), s'.
It suffices now to show that px = p'x', for x an interior point of
seg(m, d(s)). If m2 denotes the unique midpoint of m, d(s), the above
procedure is applied to obtain p, m1, m2, d(s) ≈ p', m1', m2', d(s'), where
m2' is the midpoint of m', d(s'). A continuation of the process yields
p, m1, q, d(s) ≈ p', m1', q', d(s'), for each dyadically rational point q of
seg(m, d(s)), (that is, for each point q of seg(m, d(s)) such that mq = γ ·
md(s), where γ denotes any dyadically rational number). Then pq = p'q',
and since the set of all the points q is dense in the segment, continuity of
the metric gives px = p'x'.
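
The density step at the end of this proof rests on the fact that repeated bisection reaches every parameter value in the limit. A minimal sketch, assuming the segment seg(m, d(s)) is parametrized by the ratio t = mx/md(s) in [0, 1]; the helper name `dyadic_approximations` is hypothetical.

```python
from fractions import Fraction

def dyadic_approximations(t, steps):
    """Return dyadic rationals obtained by repeated bisection of [0, 1] toward t.

    The n-th entry (n = 1, 2, ...) has the form k / 2**n and lies within
    2**(-n) of t, mirroring the repeated midpoint construction in the proof.
    """
    lo, hi = Fraction(0), Fraction(1)
    approximations = []
    for _ in range(steps):
        mid = (lo + hi) / 2          # midpoint of the current subsegment
        approximations.append(mid)
        if mid <= t:
            lo = mid                 # t lies in the upper half
        else:
            hi = mid                 # t lies in the lower half
    return approximations

# Example: a point one third of the way along the segment is never dyadic,
# yet it is the limit of dyadically rational points.
target = Fraction(1, 3)
approx = dyadic_approximations(target, 20)
```

Since pq = p'q' holds at every such dyadic point q, continuity of the metric carries the equality over to the limit point x.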

LEMMA 5.2. Let p, s, m, d(s) denote pairwise distinct points of Er(f)
such that (1) sd(s) = πr/2, (2) m is a midpoint of s, d(s), and (3) x ∈
seg(s, m, d(s)) implies px < πr/2. Points p', s', m', d(s') of E2,r exist such
that

(p) + seg(s, m, d(s)) ≈ (p') + seg(s', m', d(s')).

PROOF. By Lemma 5.1, points p', s', m', d(s') of E2,r exist such that
seg(s, m, d(s)) ≈ seg(s', m', d(s')), ps = p's', and px = p'x' if x ∈
seg(m, d(s)). The lemma is proved by showing that the mapping defined
by

p ↔ p',     seg(s, m, d(s)) ≈ seg(s', m', d(s'))

is a congruence.

It suffices to prove that py = p'y', y ∈ seg(s, m), s ≠ y ≠ m. Now
seg(s, m, d(s)) contains a point x̄ such that px ≤ px̄ < πr/2 for every
point x of that segment. Let a = πr/2 − px̄, and subdivide seg(m, y) into
n + 1 equal subsegments by means of points q0 = m, q1, q2, ..., qn+1 = y,
such that qiqi+1 < a, and qi−1qiqi+1 subsists, (i = 1, 2, ..., n). If t ∈
seg(m, d(s)) with mt = mq1, then Δ*(p, m, t) < 0, and p, m, t ≈ p', m', t'
by the preceding lemma. It follows that p, m, t, q1 ≈ p', m', t', q1', and so
pq1 = p'q1'. Since Δ*(p, m, q1) < 0, and p, m, q1 ≈ p', m', q1', the above
procedure yields pq2 = p'q2', and repeated application of the process
gives py = pqn+1 = p'q'n+1.

LEMMA 5.3. If s, d(s), p, q denote four points of Er(f) with sqd(s),
sd(s) = πr/2, and px1 = px2 = πr/2 for two distinct points x1, x2 of
seg(s, q, d(s)), then (p) + seg(s, q, d(s)) ≈ (p') + seg(s', q', d(s')), and p' is
the pole of seg(s', q', d(s')).



The proof, based upon the superposability of any two congruent triples
of E2,r with a pair of corresponding distances equal to πr/2, offers no
difficulty.
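
The role of the pole in Lemma 5.3 can be illustrated in E2,r itself. The sketch below again assumes the model of lines through the origin with distance r times the angle between them; `elliptic_distance` and `count_quadrantal` are illustrative names, not the paper's. It samples an elliptic line and counts the sampled points lying at distance πr/2 from a given point: all of them when the point is the pole of the line, and exactly one otherwise.

```python
import math

def elliptic_distance(u, v, r=1.0):
    """Distance in the assumed model: r times the angle between the lines spanned by u, v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))  # clamp rounding overshoot
    return r * math.acos(cosine)

def count_quadrantal(p, line_points, r=1.0, tol=1e-9):
    """Count the sampled points of a line whose distance from p equals pi*r/2."""
    return sum(1 for x in line_points
               if abs(elliptic_distance(p, x, r) - math.pi * r / 2) < tol)

r = 2.0
# Eight distinct points of the elliptic line z = 0 (angles t and t + pi give the same point).
line = [(math.cos(k * math.pi / 8), math.sin(k * math.pi / 8), 0.0) for k in range(8)]
pole = (0.0, 0.0, 1.0)    # pole of the line z = 0
other = (0.0, 1.0, 1.0)   # a point that is neither on the line nor its pole

on_pole = count_quadrantal(pole, line, r)
on_other = count_quadrantal(other, line, r)
```

This is the dichotomy exploited in Lemmas 5.3 and 5.4: two distinct points of a segment at distance πr/2 from p force p' to be the pole, while a single such point is the generic case.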

LEMMA 5.4. If s, m, d(s), p are four pairwise distinct points of Er(f)
such that (1) sd(s) = πr/2, (2) m is a midpoint of s, d(s), (3) px0 = πr/2 for
exactly one point x0 of seg(s, m, d(s)), then (p) + seg(s, m, d(s)) is
congruently imbeddable in E2,r.

PROOF. If x0 is an endpoint of seg(s, m, d(s)), the labelling may be
selected so that x0 = s. Then by Lemma 5.1 points p', s', m', d(s') of
E2,r exist such that

(p, s) + seg(m, d(s)) ≈ (p', s') + seg(m', d(s')).

Let y denote any interior point of seg(s, m). The procedure of Lemma 5.2
may be applied to show that py = p'y', and continuity of the metric
gives ps = p's'. The same argument applies in case x0 ≠ s, m, d(s), and
the remaining case (x0 = m) is immediate from Lemma 5.1.
The preceding lemmas establish the following theorem.

THEOREM 5.3. Any subset of Er(f) consisting of the union of a point
and a segment joining two diametral points is congruently imbeddable in
E2,r.

Let  Im  denote  the  strengthened  form  of  Postulate  I. 

THEOREM 5.4. Postulates Im, II, III, IV, V, VIf, VII are metric
postulates for elliptic n-space, where VIf postulates the elliptic feeble
four-point property.

PROOF. It suffices to show that VIf implies VIw. If p, q, s, t ∈ Er(f)
with qst subsisting, then qt = πr/2 implies (p) + seg(q, s, t) congruently
imbeddable in E2,r (Theorem 5.3), and hence so are p, q, s, t. In case
qt ≠ πr/2, then by Postulate V, Er(f) contains a point d(q) such that
qtd(q) and qd(q) = πr/2. Now s ∈ seg(q, t) ⊂ seg(q, t, d(q)), and hence the
congruent imbedding in E2,r of (p) + seg(q, t, d(q)) implies that p, q, s, t
are also imbeddable in E2,r.

6. Metric spaces with the elliptic ε-feeble four-point property. This
section is devoted to showing that the class of quadruples assumed
imbeddable can be restricted to class {Q3}, a proper subclass of {Q2}. Let
Er(ε − f) denote a space satisfying Postulates Im-V, with every quadruple
of class {Q3} congruently imbeddable in E2,r.



It  is  easily  seen  that  Theorem  5.2,  together  with  the  Corollary  and 
Remark  following  it,  are  valid  under  the  weaker  assumption  made  in  this 
section. 

THEOREM 6.1. The union of any segment of Er(ε − f) joining a
diametral point pair, and any point of the space is congruently imbeddable
in E2,r.

PROOF. Let p, s, d(s) be points of Er(ε − f) with sd(s) = πr/2, and let
X = [x ∈ seg(s, d(s)) | px = πr/2]. Clearly, X is a closed set.

Case I. X is null. Then seg(s, d(s)) admits a partition into equal
non-overlapping subsegments so small in length that the perimeter of each
triple of points contained in any quadruple formed by p and three
adjacent points effecting the partition is less than πr. An argument similar
to that used in the proof of Lemma 5.2 may be applied.

Case II. X = (s) or X = (d(s)). Select the labelling so that X = (s),
and let t be an interior point of seg(s, d(s)). It is easily seen that points
p', t', d(s') of E2,r exist such that


(p) + seg(t, d(s)) ≈ (p') + seg(t', d(s')),

with p, t, d(s) ≈ p', t', d(s'). Extend seg(t', d(s')) to s' so that s't'd(s') and
s'd(s') = πr/2 subsist. The mapping defined by

p ↔ p',     seg(s, t, d(s)) ≈ seg(s', t', d(s'))

is easily seen to be a congruence.

Case III. X = (t), t ∈ seg(s, d(s)), s ≠ t ≠ d(s). If u, v ∈ seg(s, d(s))
with sut and tvd(s) subsisting and ut = tv < ε/2, the perimeter of every
triple in the quadruple p, u, t, v is less than πr + ε. Then points p', u', v',
t' of E2,r exist with p, u, t, v ≈ p', u', t', v'. We have seg(s, t, d(s)) ≈
seg(s', t', d(s')), where the latter segment of E2,r contains seg(u', t', v').
The congruence

(p) + seg(u, t, v) ≈ (p') + seg(u', t', v')
is easily established and its extension to

(p) + seg(s, t, d(s)) ≈ (p') + seg(s', t', d(s'))

is proved by the method of Lemma 5.2.

Case  IV.     X  contains  at  least  two  points.  Then  every  point  of  seg(s,  d(s)) 
belongs  to  X  and  the  desired  conclusion  is  immediate. 



THEOREM 6.2. Postulates Im, II, III, IV, V, VI(ε − f), VII are
metric postulates for elliptic n-space, where VI(ε − f) postulates the elliptic
ε-feeble four-point property.

PROOF.  It  is  clear  from  Theorem  6.1  that  the  space  has  the  feeble 
four-point  property,  and  so  the  theorem  follows  from  Theorem  5.4. 

It is worth remarking that the argument used in establishing the basic
Theorem 6.1 requires ε to be positive. It seems likely, however, that the
theorem is valid if ε be replaced by zero; that is, if the congruent
imbedding in E2,r of all quadruples p, q, s, t with qs = st = (1/2)qt and the
perimeter of each triple of points less than πr, be assumed.

7. Metric spaces with the elliptic isosceles weak four-point property and
the elliptic strong two-triple property. This section is concerned with
spaces for which Postulates Im-V are valid and such that all quadruples
of classes {Q4} and {Q7} are congruently imbeddable in E2,r. Denote the
space Er(i.w.t.t.).

The imbeddability in E2,r of quadruples of class {Q7} suffices to
establish the following theorems and remarks. 7

THEOREM 7.1. Each two distinct non-diametral points of Er(i.w.t.t.)
are joined by a unique metric segment.

THEOREM 7.2. If p, q ∈ Er(i.w.t.t.), 0 < pq < πr/2, there is exactly
one point d(p) of the space with pqd(p) and pd(p) = πr/2.

REMARK 1. If pqd(p), pd(p) = πr/2, there is a unique seg(p, q, d(p)).

REMARK 2. The relations pqd(p), qpd(q), pd(p) = qd(q) = πr/2, imply
pd(q)d(p).

THEOREM 7.3. A one-dimensional subspace of Er(i.w.t.t.) is congruent
with E1,r.

On the other hand, the proof of the following basic theorem makes no
direct use of the congruent imbedding of quadruples of class {Q7}, but
uses only the imbedding of quadruples of class {Q4}.

THEOREM 7.4. Let p be any point and Er1 any one-dimensional subspace
of Er(i.w.t.t.). The E2,r contains a point p' and a line E1,r such that

(p) + Er1 ≈ (p') + E1,r.
7 See [1], pp. 217-220.



PROOF. The theorem follows from Theorem 7.3 in case p ∈ Er1, and is
obviously valid if px = πr/2 for every point x of Er1. It may be assumed,
therefore, that if f denotes a foot of p on Er1, then 0 < pf < πr/2. Let
a, b be points of Er1 such that afb and af = fb = πr/4.


ASSERTION. The point f is the only foot of p on seg(a, f, b).

If there were two additional feet f1, f2 of p on seg(a, f, b), then the
E2,r contains points f1', f2', f', p' with f1, f2, f, p ≈ f1', f2', f', p'. But
p'f1' = p'f2' = p'f' and the linearity of f1', f2', f' imply pf = p'f' = πr/2,
contrary to the above.

Suppose, now, that f1 is a foot of p on seg(a, f, b), f ≠ f1, and denote by
g the midpoint of f, f1. From the congruent imbedding of p, f, g, f1 in
E2,r follows pg = πr/2. Assume the labelling so that gf1b or f1 = b. If x is
interior to seg(a, f), then pf < px < pg, and so a point y of seg(f, g)
exists such that px = py. Similarly, a point z of seg(g, f1) exists such that
px = py = pz. Imbedding p, x, y, z in E2,r yields px = py = pz = πr/2,
and imbedding p, f, x, y in E2,r gives pf = πr/2. Hence the Assertion is
proved.

Select seg(x, y) on Er1 so that f is its midpoint, xy < πr/2, and
px + xy + py < πr. If px = py, let the labels q, s replace x, y,
respectively, while in the contrary case, label so that px < py. In the latter event, a
point z of seg(f, y) exists such that pz = px, z ≠ f. Now let q be the label
of x, and s the label of z. Then q, f, s, p ≈ q', f', s', p', with the "primed"
points in E2,r and q'f's' holding. The proof is completed by showing that
the correspondence defined by

p ↔ p',     Er1(q, f, s) ≈ E1,r(q', f', s')


is a congruence, where the congruence exhibited in the correspondence
is an extension of q, f, s ≈ q', f', s'. If x ∈ Er1(q, f, s), its correspondent in
E1,r(q', f', s') will be denoted by "primes".

If g ∈ seg(q, f, s) it is easily seen that p, q, g, s ≈ p', q', g', s', and
consequently

(*)  (p) + seg(q, f, s) ≈ (p') + seg(q', f', s').

It follows that f' is the foot of p' on seg(q', f', s'), and q'f' = qf = fs = f's'.
In the triangles (p', f', s') and (p', f', q') the angles ∠p'f's' and ∠p'f'q'
are right angles. Let x, y be points of Er1 such that fqx subsists, qx <
min(aq, fq), xfy holds, with fy < fx, and px = py.

Then p, x, y, f ≈ p", x", y", f", with the latter quadruple in E2,r.



ASSERTION. The point f" is the foot of p" on E1,r(x", f", y").

From p"x" = px = py = p"y", the foot of p" on E1,r(x", f", y") is
either the midpoint m" of seg(x", y") or its diametral point d(m") on that
line. The latter alternative is easily seen to be impossible.

Now the midpoint m of seg(x, y) is a point of seg(q, s), and so pm < πr/2.
From the congruent imbedding in E2,r of p, x, m, y it is seen that m" and
f" coincide, and the Assertion is established.

The equalities xf = x"f" = f"y" = fy follow, and letting x', y' denote
the points on E1,r(q', f', s') corresponding to x", y", respectively, yields
x"f" = x'f' = f'y' = f"y", and p"f" = pf = p'f'. The two elliptic right
triangles (p", f", y"), (p', f', y') are congruent, and so py = p"y" = p'y' =
p'x'. Also p"y" = py = px, and consequently px = p'x'. Thus
congruence (*) can be extended to

(p) + seg(x, f, y) ≈ (p') + seg(x', f', y'),

and since pz < πr/2, z ∈ seg(x, y), repetition of the procedure yields
(**)  (p) + seg(a, f, b) ≈ (p') + seg(a', f', b').

Let d(f) denote the point of Er1(a, f, b) that is diametral to f, and
suppose g is an interior point of seg(a, d(f)). Since agb and pa = pb hold,
p, a, g, b ≈ p", a", g", b", points of E2,r, and p", a", b" ≈ p, a, b ≈
p', a', b'. Since a'b' = πr/2, the two triples are superposable, and a motion
gives p, a, g, b ≈ p', a', g*, b', with g* on E1,r(a', f', b') and a', b', g* ≈
a', b', g'. Then either g' = g* or g* is the reflection of g' in a'. The latter
case is easily seen to be impossible. Hence p, a, g, b ≈ p', a', g', b' and
pg = p'g'.

In a similar manner it is seen that pv = p'v', for v an interior element
of seg(b, d(f)), and the theorem is proved.

It follows at once that Er(i.w.t.t.) has the elliptic weak four-point
property and the following axiomatization results.

THEOREM 7.5. Postulates Im, II, III, IV, V, VI(i.w.t.t.), VII are
metric postulates for elliptic n-space, where Postulate VI(i.w.t.t.) asserts
that every quadruple of classes {Q4} and {Q7} is congruently imbeddable in
E2,r.

8. Metric spaces with quadruples of classes {Q5}, {Q6}, {Q7} imbeddable in
E2,r. By virtue of the strong two-triple property (the imbedding of
quadruples of class {Q7}), Theorems 7.1, 7.2, 7.3 of the preceding section
are valid here. Let Er* denote any metric space satisfying the demands of


Postulates II-V, and such that each quadruple of its points belonging to
{Q5}, {Q6}, or {Q7} is congruently imbeddable in E2,r.

THEOREM 8.1. The set sum of any point and any one-dimensional
subspace of Er* is congruently imbeddable in E2,r.

PROOF. It suffices to consider the case of p ∈ Er*, Er1 ⊂ Er*, f a foot
of p on Er1, and 0 < pf < πr/2. It is not difficult to show that (1) f is
unique and (2) Er1 contains at most one point g with pg = πr/2. If such a
point g exists, and f*, g* denote those points of Er1 diametral to f, g,
respectively, let x1, x2 be points of seg(f, g), seg(f, g*, f*), respectively,
such that fx1 < fx2. It may be shown that px1 < px2.

Let a, b be the two midpoints of f, f* on Er1, and suppose g ∈ seg(f, a, f*).
Choose points x, y of Er1 so that x ∈ seg(a, f), y ∈ seg(f, b), 2xy < fg,
px = py, and Δ*(p, x, y) < 0. Assume that the midpoint m of x, y is
distinct from f. Points s, t of Er1 exist such that s ∈ seg(x, f), t ∈ seg(f, y),
ps = pt and st = 2·ty. Then p, s, t, y ≈ p", s", t", y", points of E2,r. Let
z be a point of Er1 such that ts = 2·sz. Then it is easily seen that
p, s, t, z ≈ p", s", t", z", and p"z" = p"y". Since p"y" = py = px and
px = pz, with z interior to seg(f, g), it follows that x = z. Hence xs = ty,
and m is the midpoint of s, t. Repeating this procedure a finite number of
times establishes m as the midpoint of a pair of points for which it is not a
betweenpoint. As a result of this contradiction, we conclude that pairs of
points in seg(x, y) having f as their midpoint are equidistant from p.

Select points a', f', b', p' of E2,r such that Er1(a, f, b) ≈ E1,r(a', f', b') is
an extension of a, f, b ≈ a', f', b', and f' is the foot of p' on E1,r(a', f', b'),
with p'f' = pf. The proof is completed by showing that the mapping

p ↔ p',     Er1(a, f, b) ≈ E1,r(a', f', b')
is a congruence.

THEOREM 8.2. Postulates Im, II, III, IV, V, VI*, VII are metric
postulates for elliptic n-space, where Postulate VI* asserts the congruent
imbedding in E2,r of all point quadruples of classes {Q5}, {Q6}, {Q7}.

9. A fundamental unsolved problem. It is known that every semimetric
space is congruently imbeddable in the E2,r whenever each 7 of its points
are so imbeddable [7] 8. This is stated by saying that the elliptic plane
has congruence indices {7, 0} with respect to the class of semimetric

8 See also [4], in which congruence indices {8, 0} are proved.



spaces. Since the E2,r contains an equilateral sextuple, the congruence
indices {7, 0} are the best (that is, for no integers m, k (m < 7) are indices
{m, k} valid). This result, together with Theorem 3.1, completely solves
the congruent imbedding problem for E2,r and hence provides a metric
axiomatization for the class of subsets of E2,r. Metric postulates for the
E2,r itself are obtained by adjoining any metric properties that serve to
distinguish the elliptic plane among its subsets, though, as observed
earlier in this paper, this approach to the metric characterization of a
space is likely to result in a redundant set of postulates.
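
An equilateral sextuple of the kind mentioned above can be exhibited concretely. The sketch below assumes the model of E2,r as lines through the origin with distance r times the angle between them, and takes as an example the six diagonals of a regular icosahedron; the paper does not say which sextuple it has in mind, so this choice, like the names `SEXTUPLE` and `pairwise_distances`, is an illustration only.

```python
import math
from itertools import combinations

def elliptic_distance(u, v, r=1.0):
    """Distance in the assumed model: r times the angle between the lines spanned by u, v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))  # clamp rounding overshoot
    return r * math.acos(cosine)

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

# One vertex from each antipodal pair of the icosahedron with vertices
# (0, ±1, ±PHI), (±1, ±PHI, 0), (±PHI, 0, ±1): six lines through the origin.
SEXTUPLE = [
    (0.0, 1.0, PHI), (0.0, 1.0, -PHI),
    (1.0, PHI, 0.0), (1.0, -PHI, 0.0),
    (PHI, 0.0, 1.0), (-PHI, 0.0, 1.0),
]

def pairwise_distances(points, r=1.0):
    """Return the elliptic distances of all unordered pairs of the given points."""
    return [elliptic_distance(u, v, r) for u, v in combinations(points, 2)]

distances = pairwise_distances(SEXTUPLE, r=1.0)
common = math.acos(1 / math.sqrt(5))  # the common mutual distance for r = 1
```

All fifteen mutual distances agree, so these six points form an equilateral sextuple of the elliptic plane in this model.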

Perhaps the most important unsolved problem suggested by this manner
of studying elliptic geometry is the determination of congruence indices for
En,r when n > 2. The problem for the E2,r is quite difficult, and the
methods employed there seem incapable of extension even to E3,r.
Apparently an entirely different approach is needed. At the present time
not even any preliminary results concerning the problem for general
dimension have been obtained.


Bibliography 

[1] BLUMENTHAL, L. M., Theory and applications of distance geometry. The
Clarendon Press, Oxford 1953, XI + 347 pp.

[2] BLUMENTHAL, L. M., An extension of a theorem of Jordan and von Neumann.
Pacific Journal of Mathematics, vol. 5 (1955), pp. 161-167.

[3] BLUMENTHAL, L. M., Metric characterization of elliptic space. Transactions
American Mathematical Society, vol. 59 (1946), pp. 381-400.

[4] BLUMENTHAL, L. M., and KELLY, L. M., New metric-theoretic properties of
elliptic space. Revista de la Universidad Nacional de Tucumán, vol. 7 (1949),
pp. 81-107.

[5] HANKINS, J. D., Metric characterizations of elliptic n-space. University of
Missouri doctoral dissertation, 1954.

[6] MENGER, K., Untersuchungen über allgemeine Metrik. Mathematische Annalen,
vol. 100 (1928), pp. 75-163.

[7] SEIDEL, J., De Congruentie-orde van het elliptische vlak. Thesis, University of
Leiden, 1948, iv + 71 pp.

[8] WILSON, W. A., A relation between metric and euclidean spaces. American
Journal of Mathematics, vol. 54 (1932), pp. 505-517.


Symposium  on  the  Axiomatic  Method 


AXIOMS  FOR  GEODESICS  AND  THEIR  IMPLICATIONS 

HERBERT  BUSEMANN 

University  of  Southern  California,  Los  Angeles,  California,    U.S.A. 

The  foundations  of  geometry  are  principally  concerned  with  elementary 
geometry  and  in  particular  with  the  role  of  continuity.  Although  our 
intuition  relies  on  continuity,  very  large  sections  of  euclidean  and  non- 
euclidean  geometry  prove  valid  without  continuity  hypotheses. 

This lecture deals with the foundations of metric differential geometry.
Continuity is taken for granted, and the interest centers on the question
of the extent to which differentiability is necessary. To delineate the
subject more clearly, we emphasize that we do not mean results like that of
Wald [15], who characterizes Riemannian surfaces with a continuous
Gauss curvature among metric spaces, because his point is not the
weakening or omission of differentiability properties but their replacement
by other limit processes.

Two major advances have been made in the indicated direction. The first,
our principal subject, is due to the author and concerns, roughly,
the intrinsic geometry in the large of not necessarily Riemannian spaces;
most of this theory can be found in the book [3]. The second is the work
of A. D. Alexandrov and deals with surfaces either in $E^3$ or with an abstract
intrinsic Riemannian metric. Much of this material is found in his book
[1], and a brief survey in [6]. In both theories the tools created in order
to do without differentiability assumptions proved in many instances far
superior to the classical ones; in fact, they yield a number of results which
remain inaccessible to the traditional methods, even when smoothness is
granted.

It  also  appears  that  a  frequently  followed  procedure,  which  works,  as  it 
were,  from  the  top  down  by  reducing  differentiability  hypotheses  in 
existing  proofs,  has,  in  general,  very  little  chance  of  producing  final 
results. 

The axioms for a G-space, [3, Chapter I]. Since we are interested in
metric differential geometry, our first axiom is:

I.     The  space  is  metric. 


We call the space R, denote points by small Roman letters, and the distance
from x to y by xy. Since the concept of metric space has been generalized
in various ways we mention explicitly that we require the standard
properties: $xx = 0$, $xy = yx > 0$ if $x \ne y$, and the triangle inequality
$xy + yz \ge xz$. But large parts of the theory hold without the symmetry
condition $xy = yx$.

The relations $x \ne y$, $y \ne z$ and $xy + yz = xz$ will be written briefly as
(xyz). A set M in R is bounded if $xy < \beta$ for a suitable $\beta$ and all x, y in M.
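
As an illustration only (it is not part of the lecture), the standard properties and the relation (xyz) can be checked numerically on a finite sample of points. The sketch below assumes the euclidean plane as the metric space; the names `dist`, `is_metric_on` and `between` are hypothetical.

```python
import itertools
import math

def dist(x, y):
    """Euclidean distance in the plane, standing in for the abstract distance xy."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

def is_metric_on(points, d, tol=1e-12):
    """Check xx = 0, xy = yx > 0 for x != y, and xy + yz >= xz on a finite sample."""
    for x in points:
        if abs(d(x, x)) > tol:
            return False
    for x, y in itertools.permutations(points, 2):
        if d(x, y) <= 0 or abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in itertools.permutations(points, 3):
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return False
    return True

def between(x, y, z, d, tol=1e-9):
    """The relation (xyz): x != y, y != z and xy + yz = xz (up to rounding)."""
    return x != y and y != z and abs(d(x, y) + d(y, z) - d(x, z)) <= tol

# A small sample of distinct points (an arbitrary choice for the illustration).
sample = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (1.0, 2.0)]
print(is_metric_on(sample, dist))
print(between((0.0, 0.0), (1.0, 0.0), (3.0, 0.0), dist))  # collinear, in order
print(between((0.0, 0.0), (1.0, 2.0), (3.0, 0.0), dist))  # off the segment
```

A finite check of this kind can only refute the properties, never prove them; it is meant to make the notation concrete.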

Our second axiom is the validity of the Bolzano-Weierstrass theorem:

II. A bounded infinite set has an accumulation point.

In conjunction with the following axioms, II entails that the space is
complete and behaves in all essential respects like a finite-dimensional
space. Whether the axioms actually imply finite dimensionality is an
open question.

The third axiom guarantees that the metric is intrinsic. It was introduced
by Menger as convexity of a metric space:

III. If $x \ne z$, then a point y with (xyz) exists.

It follows from I, II, III that any two points x, y can be connected by a
segment T(x, y), i.e., a set isometric to an interval $[\alpha, \beta]$ of the real t-axis.
T(x, y) can therefore be represented in the form z(t), $\alpha \le t \le \beta = \alpha + xy$, with

(1)  $z(t_1)z(t_2) = |t_1 - t_2|,$

and $z(\alpha) = x$, $z(\beta) = y$.

These axioms are satisfied, for example, by a closed convex subset of a
euclidean space. However, we aim at geometries which cannot be extended
without increasing the dimension. Obviously some form of
prolongability is necessary. Requiring that any segment can be prolonged
would be too strong; it would eliminate even the ordinary spherical
metric. If $S(p, \rho)$ denotes the set of points x with $px < \rho$, we postulate:

IV. Every point p has a neighborhood $S(p, \rho_p)$, $\rho_p > 0$, such that for
any two distinct points x, y in $S(p, \rho_p)$ a point z with (xyz) exists.

The generality of the function $\rho_p$ is deceiving; the axiom furnishes
a function $\rho(p) > 0$ satisfying IV and the Lipschitz condition

$$|\rho(p) - \rho(q)| \le pq.$$

The four axioms yield geodesics. A geodesic is a locally isometric image
of the real t-axis; precisely, it can be represented in the form z(t),
$-\infty < t < \infty$, and there is a positive function $\varepsilon(t)$ such that (1) holds
for $|t_i - t| \le \varepsilon(t)$, $i = 1, 2$. Thus, the geodesics on an ordinary cylinder
are either entire helices, entire straight lines or circles traversed infinitely
often.

Geodesics exist in the following sense: a function z(t) satisfying (1) in an
interval $\alpha \le t \le \beta$, $\alpha < \beta$, can be extended to all real t, so that it
represents a geodesic. This is the analogue to the indefinite continuation of
a line element into a geodesic in the classical case.

Axioms I to IV contain no uniqueness properties. In an $(x_1, x_2)$-plane
metrized by $xy = |x_1 - y_1| + |x_2 - y_2|$ any curve $z(s) = (z_1(s), z_2(s))$,
$a \le s \le b$, for which both $z_1(s)$ and $z_2(s)$ are monotone is a segment from
z(a) to z(b). We observe that in the classical case a segment can be
prolonged by a given amount in at most one way and therefore postulate:

V. If $(xyz_1)$, $(xyz_2)$ and $yz_1 = yz_2$, then $z_1 = z_2$.
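
The failure of uniqueness in the plane with the metric $|x_1 - y_1| + |x_2 - y_2|$ can be seen on a small numerical example. The following is a sketch with hypothetical names (`taxicab`, `between`, `length`) and arbitrarily chosen sample points: two different monotone paths realize the distance of their endpoints, and two different prolongations of the same segment by the same amount violate Axiom V.

```python
def taxicab(x, y):
    """The metric xy = |x1 - y1| + |x2 - y2| on the plane."""
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def between(x, y, z, d, tol=1e-12):
    """The relation (xyz): x != y, y != z and xy + yz = xz."""
    return x != y and y != z and abs(d(x, y) + d(y, z) - d(x, z)) <= tol

def length(path, d):
    """Length of a polygonal path given by its consecutive vertices."""
    return sum(d(p, q) for p, q in zip(path, path[1:]))

# Two different monotone paths from (0, 0) to (2, 2), sampled at a few points.
corner_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
diagonal_path = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (1.5, 1.5), (2.0, 2.0)]

# Both have length equal to the distance of the endpoints, as segments must.
print(length(corner_path, taxicab), length(diagonal_path, taxicab),
      taxicab((0.0, 0.0), (2.0, 2.0)))

# Axiom V fails: (x y z1) and (x y z2) hold with y z1 = y z2, yet z1 != z2.
x, y = (0.0, 0.0), (1.0, 1.0)
z1, z2 = (2.0, 1.0), (1.0, 2.0)
print(between(x, y, z1, taxicab), between(x, y, z2, taxicab),
      taxicab(y, z1) == taxicab(y, z2), z1 != z2)
```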

The five axioms guarantee that the above extension z(t) of a segment to
a geodesic is unique. Moreover, if (xyz) then T(x, y) and T(y, z) are unique
(because two different T(y, z) would yield two different prolongations of
T(x, y)). In particular, T(x, y) is unique for $x, y \in S(p, \rho_p)$, so that the
local uniqueness of the shortest connection, which is so important for many
investigations in differential geometry, need not be explicitly stipulated.

The  spaces  satisfying  the  five  axioms  are  called  G-spaces,  the  G  alluding 
to  geodesic. 

There are two particularly simple types of geodesics, namely those
which satisfy (1) for arbitrary $t_1$, $t_2$ and are therefore isometric to the
entire real axis; they are called straight lines. The others are the so-called
great circles which are isometric to ordinary circles. A representation
z(t) of a great circle of length $\beta$ is characterized by

$$z(t_1)z(t_2) = \min_{|\nu| = 0, 1, 2, \ldots} |t_1 - t_2 + \nu\beta|.$$

The cylinder shows that straight lines, great circles and geodesics which
are neither may occur in one space. When IV holds in the large, or z with
(xyz) exists for any two distinct points x, y, then all geodesics are straight
lines (and conversely), and the space is called straight.
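
The three kinds of geodesics on the cylinder can be made concrete with a short computation. The sketch below is an illustration under stated assumptions: the cylinder is modelled as the product of a line and a circle of length $2\pi$ with the flat metric, and the names `circle_dist`, `cylinder_dist` and `helix` are hypothetical. The function `circle_dist` evaluates the minimum of $|t_1 - t_2 + \nu\beta|$ over all integers $\nu$.

```python
import math

def circle_dist(t1, t2, beta):
    """Distance on a great circle of length beta: min over integers v of |t1 - t2 + v*beta|."""
    r = math.fmod(abs(t1 - t2), beta)
    return min(r, beta - r)

def cylinder_dist(p, q, beta=2 * math.pi):
    """Flat cylinder (line x circle of length beta); points are (height, arc position)."""
    return math.hypot(p[0] - q[0], circle_dist(p[1], q[1], beta))

def helix(t, phi):
    """Unit-speed geodesic making the angle phi with the generators of the cylinder."""
    return (t * math.cos(phi), t * math.sin(phi))

phi = math.pi / 4
# A helix satisfies (1) only for nearby parameter values.
print(cylinder_dist(helix(0.0, phi), helix(1.0, phi)))    # close to 1
print(cylinder_dist(helix(0.0, phi), helix(10.0, phi)))   # noticeably less than 10
# A generator (phi = 0) satisfies (1) for all parameter values: a straight line.
print(cylinder_dist(helix(0.0, 0.0), helix(100.0, 0.0)))
# A cross-section (phi = pi/2) closes up after one circuit: a great circle.
print(cylinder_dist(helix(0.0, math.pi / 2), helix(2 * math.pi, math.pi / 2)))
```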

The  lowest  dimensional  G-spaces  are  uninteresting.  A  0-dimensional 
space  is  obviously  a  point  and  a  one-dimensional  G-space  is  a  straight  line 
or  a  great  circle.  The  two-dimensional  G-spaces  can  be  proved  to  be 
topological  manifolds;  the  corresponding  problem  for  higher  dimensions 
is  open. 



It is important to notice that the axioms comprise the Finsler spaces,
where the line element has the form
$ds = f(x_1, \ldots, x_n; dx_1, \ldots, dx_n) = f(x, dx)$,
and f(x, dx) satisfies certain standard conditions (see [3, Section
15]) but need not be quadratic or Riemannian, $ds^2 = \sum g_{ik}(x)\,dx_i\,dx_k$. The
analytical methods often become highly involved for Finsler spaces. This
explains why the limitation of the hypotheses inherent to the axiomatic
approach leads in this case to improved methods, which have the additional
appeal of effecting a synthesis of differential geometry, topology,
the calculus of variations, the foundations of geometry, and convex body
theory.

Spaces with negative curvature, [3, Chapter V]. It is impossible to
outline the whole theory of G-spaces in the space available here. We
therefore restrict ourselves to giving a few typical results and discuss in
greater detail only the theory of parallels which is more closely related to
the remaining geometric topics of this symposium.

Hadamard  discovered  in  [8]  that  the  surfaces  with  negative  curvature 
have  many  beautiful  properties.  For  Riemann  spaces  his  results  were  ex- 
tended by  others  in  various  directions.  The  very  concept  of  curvature 
seems  to  imply  notions  of  differentiability.  However,  each  of  the  following 
two  properties  proves  in  the  Riemannian  case  to  be  equivalent  to  non- 
positive  (negative)  curvature: 

For each point p there is a positive $\delta_p$, $0 < \delta_p \le \rho_p$, such that

(a) if a, b, c lie in $S(p, \delta_p)$ but not on a segment, and b', c' are the
midpoints of T(a, b) and T(a, c), then $2b'c' \le bc$ ($2b'c' < bc$);

(b) if $C(T, \varepsilon)$ denotes the set $xT \le \varepsilon$ where T is a segment and
$C(T, \varepsilon) \subset S(p, \delta_p)$, then $C(T, \varepsilon)$ is convex (strictly convex).

Convexity  is  defined  in  the  usual  way  by  means  of  the  T(x,  y). 
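
Condition (a), read as $2b'c' \le bc$ for the midpoints b', c' of T(a, b) and T(a, c), can be tested numerically in the hyperbolic plane, where the strict inequality is expected. The sketch below assumes the hyperboloid model (an assumption of this illustration, not of the lecture); the helper names `mink`, `point`, `hyp_dist` and `hyp_midpoint` and the sample points are hypothetical.

```python
import math

def mink(u, v):
    """Minkowski form <u, v> = u1*v1 + u2*v2 - u0*v0, coordinates ordered (u0, u1, u2)."""
    return u[1] * v[1] + u[2] * v[2] - u[0] * v[0]

def point(x1, x2):
    """Lift (x1, x2) to the hyperboloid <u, u> = -1, u0 > 0."""
    return (math.sqrt(1.0 + x1 * x1 + x2 * x2), x1, x2)

def hyp_dist(u, v):
    """Hyperbolic distance arcosh(-<u, v>); the max guards against rounding below 1."""
    return math.acosh(max(1.0, -mink(u, v)))

def hyp_midpoint(u, v):
    """Midpoint of the hyperbolic segment: normalize u + v back onto the hyperboloid."""
    s = tuple(a + b for a, b in zip(u, v))
    n = math.sqrt(-mink(s, s))
    return tuple(c / n for c in s)

# Three points not on one geodesic (arbitrary sample values).
a, b, c = point(0.0, 0.0), point(1.5, 0.2), point(-0.3, 2.0)
b1, c1 = hyp_midpoint(a, b), hyp_midpoint(a, c)

print(2 * hyp_dist(b1, c1), hyp_dist(b, c))
```

In the euclidean plane the two printed quantities would coincide; in the hyperbolic plane the first is smaller.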

In  Finsler  spaces,  hence  also  in  G-spaces,  (a)  is  stronger  than  (b).  The 
geometry  discovered  by  Hilbert  [9,  Anhang  I]  (which  corresponds  to  the 
Klein  model  of  hyperbolic  geometry  with  a  convex  curve  replacing  the 
ellipse  as  absolute  locus)  furnishes  an  example  where  (b)  holds  but  not  (a). 
In  fact,  a  Hilbert  geometry  satisfying  (a)  is  hyperbolic  (see  Kelly  and 
Strauss  [11]).  The  condition  (b)  was  introduced  by  Pedersen  [12]. 

Practically all of Hadamard's and the later results on Riemann spaces
with non-positive or negative curvature hold for G-spaces with the property
of domain invariance satisfying (a), and many are valid when (b) holds. In
particular, (b) implies that the universal covering space of the given
space R is straight. Consequently, for two given points of R, the geodesic
connection is unique within a given homotopy class. Moreover, when the
$C(T, \varepsilon)$ are strictly convex, then there is at most one closed geodesic
within a given class of freely homotopic curves; if compact, the space
cannot have an abelian fundamental group and possesses only a finite
number of motions (isometries of the space on itself). The latter result is,
for Riemann spaces, contained in Bochner [2] and for the general case in
[7]. The theory of parallels (see below) requires (a) instead of (b).

Characterizations of the elementary spaces, [3, Chapter VI]. For brevity
we call the euclidean, hyperbolic, and spherical spaces elementary. The
bisector B(a, a') of two distinct points a, a' in a G-space is the locus of the
points x equidistant from a and a'; that is, $ax = a'x$. The elementary
spaces, with the exception of the 1-sphere, are characterized by the fact that
their bisectors contain with any two points x, y at least one segment T(x, y).
The principal theorem is the following local version:

(2) Let $0 < \delta \le \rho_p$ and assume that for any five distinct points a, a', b, c, x in
$S(p, \delta)$ the relations $ab = a'b$, $ac = a'c$ and (bxc) entail $ax = a'x$. Then
$S(p, \delta)$ is isometric to an open sphere of radius $\delta$ in an elementary space.

The hypothesis means, of course, that $b, c \in B(a, a')$ implies $x \in B(a, a')$.
Whereas  the  proofs  for  the  results  on  spaces  with  non-positive  curvature 
are  not  essentially  longer  than  they  would  be  under  differentiability 
hypotheses,  such  hypotheses  would  very  materially  shorten  the  long  proof 
of  (2)  in  [3,  Sections  46,  47]. 
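
The bisector hypothesis of (2) can be probed numerically. The sketch below is only an illustration under stated assumptions: it compares the euclidean plane with the plane carrying the $l_4$ norm, taken here as a sample Minkowski plane with a strictly convex norm whose segments are the straight segments. Two points b, c of the bisector B(a, a') are located by bisection, and the midpoint of T(b, c) is tested for equidistance from a and a'. All names and sample values are hypothetical.

```python
import math

def norm4(v):
    """The l_4 norm on the plane (strictly convex, not euclidean)."""
    return (abs(v[0]) ** 4 + abs(v[1]) ** 4) ** 0.25

def euclid(v):
    """The euclidean norm on the plane."""
    return math.hypot(v[0], v[1])

def dist(norm, x, y):
    return norm((x[0] - y[0], x[1] - y[1]))

def bisector_point(norm, a, a1, x1, lo=-50.0, hi=50.0):
    """Bisection for x2 such that (x1, x2) is equidistant from a and a1.

    Assumes the sign of ax - a1x changes exactly once on [lo, hi]."""
    h = lambda x2: dist(norm, a, (x1, x2)) - dist(norm, a1, (x1, x2))
    sign_lo = h(lo) > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (h(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return (x1, 0.5 * (lo + hi))

def bisector_defect(norm, a, a1):
    """|ax - a'x| at the midpoint x of two bisector points b, c."""
    b = bisector_point(norm, a, a1, 0.5)
    c = bisector_point(norm, a, a1, 3.0)
    x = ((b[0] + c[0]) / 2, (b[1] + c[1]) / 2)  # midpoint of the straight segment T(b, c)
    return abs(dist(norm, a, x) - dist(norm, a1, x))

a, a1 = (0.0, 0.0), (2.0, 1.0)
print(bisector_defect(euclid, a, a1))  # essentially zero
print(bisector_defect(norm4, a, a1))   # clearly positive
```

The positive defect for the $l_4$ norm is consistent with (2): such a plane is not elementary, so its bisectors cannot contain the segments joining their points.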

Various well-known theorems are corollaries of (2); examples are the
following global and local answers to the Helmholtz-Lie Problem:

If for two given isometric triples $a_1, a_2, a_3$ and $a_1', a_2', a_3'$ (i.e.,
$a_i a_k = a_i' a_k'$) of a G-space R a motion of R exists taking $a_i$ into $a_i'$,
$i = 1, 2, 3$, then R is elementary.

If every point of a G-space R has a neighborhood $S(p, \delta)$, $0 < \delta < \rho_p$,
such that for any four points $a_1, a_2, a_1', a_2'$ in $S(p, \delta)$ with $pa_1 = pa_1'$,
$pa_2 = pa_2'$ and $a_1a_2 = a_1'a_2'$ a motion of $S(p, \delta)$ exists which takes $a_i$ into
$a_i'$, $i = 1, 2$, then the universal covering space of R is elementary.

By using deeper results from the modern theory of topological and Lie
groups, Wang [16] and Tits [14] succeeded in determining all spaces with
the property that for any two pairs $a_1, a_2$ and $a_1', a_2'$ with $a_1a_2 = a_1'a_2'$ a
motion exists taking $a_i$ into $a_i'$, $i = 1, 2$. If the space has an odd dimension
then it is either elementary or elliptic; for even dimensions greater than 2
there are other solutions.

Inverse problems in the large for surfaces, [3, Sections 11, 33], [4], [13].
In  inverse  problems  of  the  calculus  of  variations  one  gives  a  set  of  curves 
and  asks  whether  they  occur  as  the  extremals  of  a  variational  problem. 
The  local  inverse  problem  in  two  dimensions  was  solved  by  Darboux,  but 
his  method  provides  no  answers  in  the  large.  Inverse  problems  in  the  large 
cannot  be  treated  by  one  method,  they  differ  depending  on  the  topo- 
logical  structure  of  the  surface,  and  are  inaccessible  to  the  traditional 
approach. 

Three of these problems have been solved with the present methods.
We mention first that a G-space, in which the geodesic through two (distinct)
points is unique, is either straight, or all its geodesics are great circles of the
same length $\beta$ (see [3, Theorem (31.2)]). In the latter case we say that the
space is of the elliptic type. If its dimension exceeds 1, then it has a
two-sheeted universal covering space which shares with the sphere the
properties that all geodesics are great circles of the same length $2\beta$, and
that all geodesics that pass through a given point meet again at a second
point.

A two-dimensional G-space R, in which the geodesic through two
points is unique, is therefore either homeomorphic to the euclidean plane
$E^2$ or to the projective plane $P^2$. If in the case of $E^2$ a euclidean metric
e(x, y) is introduced, then the geodesics of R form a system N of curves
which have, in terms of e(x, y), the following two properties:

1) Each curve in N is representable in the form z(t), $-\infty < t < \infty$, with
$z(t_1) \ne z(t_2)$ for $t_1 \ne t_2$ and $e(z(0), z(t)) \to \infty$ for $|t| \to \infty$.

2) There is exactly one curve of N through two distinct points.

The  answer  to  the  corresponding  inverse  problem  is: 

If a system N of curves in $E^2$ with the properties 1), 2) is given, then the
plane can be remetrized as a G-space with the curves in N as geodesics.

It will become clear from examples later that the problem of determining
all metrics with the curves in N as geodesics has too many solutions
to be interesting.

The inverse problem for $P^2$ was solved by Skornyakov [13]; a simpler
proof is found in [4]:

In $P^2$ let a system N' of curves homeomorphic to a circle be given and such
that there is exactly one curve of N' through 2 given distinct points. Then
$P^2$ may be metrized as a G-space with the curves in N' as geodesics.

The third problem solved with these methods is that of a torus with a
straight universal covering space. It differs from the preceding problems in
that there are non-obvious necessary conditions. In the plane as the
universal covering space of the torus we can introduce an auxiliary
euclidean metric e(x, y) and cartesian coordinates $x_1$, $x_2$ such that the
covering transformations are the translations $T(m_1, m_2)$:

$$x_1' = x_1 + m_1, \quad x_2' = x_2 + m_2, \qquad m_1, m_2 \text{ integers}.$$

To the geodesics on the torus there then corresponds in the plane a
system N of curves which satisfies the conditions 1), 2) above and in
addition the following:

3) N goes into itself under the $T(m_1, m_2)$.

4) If a curve in N passes through q and $qT(m_1, m_2)$ then it also passes
through the points $qT(\nu m_1, \nu m_2)$, $|\nu| = 1, 2, \ldots$.

5) N satisfies the parallel axiom (in its usual form, see below).

Whereas  it  is  not  hard  to  establish  4),  the  proof  of  5)  is  far  from  obvious 
and  represents,  as  far  as  the  author  is  aware,  the  only  instance  in  the 
literature  where  the  validity  of  the  parallel  axiom  appears  as  a  non- 
trivial  theorem. 

If a system N of curves in the plane is given which has the properties 1)
to 5), then the plane can be metrized as a G-space with the curves in N as
geodesics, where the metric is invariant under the $T(m_1, m_2)$ and thus yields
a metrization of the torus.

The curves in N need not satisfy Desargues' Theorem, even when N and
the metric are invariant under $T(m_1, r)$, where r is an arbitrary real number.
These facts exhibit very clearly the great generality of Finsler spaces
as compared to Riemann spaces: the only Riemannian metrizations of the
torus such that the universal covering space is straight are euclidean (see E.
Hopf [10]).

The theory of parallels, [3, Chapter III]. In the foundations of geometry
the congruence axioms, parallel axiom (euclidean or hyperbolic), and
the continuity axioms usually appear in this order. The present theory
suggests the study of parallelism on the basis of continuity without
congruence or mobility axioms.

In a straight space denote by G(a, b), $a \ne b$, the geodesic, briefly line,
through a and b and by $G^+(a, b)$ the same line with the orientation in
which b follows a. Let $G^+$ be any oriented line, p any point. Then $G^+(p, x)$
converges to a line $A^+$, when x traverses $G^+$ in the positive direction. The
convergence of $G^+(p, x)$ is trivial in the plane, but not in higher dimensions.
$A^+$ is called the asymptote to $G^+$ through p. It is independent of p in the
sense that for $q \in A^+$ the line $G^+(q, x)$ also tends to $A^+$.

Denote by $A^-$, $G^-$ the opposite orientations of the lines A, G carrying
$A^+$, $G^+$. If $A^+$ and $A^-$ are asymptotes to $G^+$ and $G^-$ respectively, then we
call A parallel to G. These definitions suggest investigating the following
properties:

SYMMETRY: If $A^+$ is an asymptote to $G^+$, then $G^+$ is an asymptote to $A^+$.
If A is parallel to G, then G is parallel to A.

TRANSITIVITY: If $A^+$ is an asymptote to $B^+$, and $B^+$ is to $C^+$, then so is
$A^+$ to $C^+$.

It  is  very  easily  seen  that  the  transitivity  of  the  asymptote  relation  implies 
its  symmetry.  The  converse  holds  in  the  plane,  but  it  is  not  known  whether 
this  extends  to  higher  dimensions. 

Even in a plane the asymptote relation is not always symmetric. In an
(x, y)-plane let $H$, $H_1$ be the branches $x < 0$ of the hyperbolas $xy = -1$
and $xy = 1$ respectively. Let $H^-$, $H_1^-$ be their orientations corresponding
to decreasing x. The system $N$ consists of all curves obtainable by
translations from $H$, of the lines $y = mx + b$, $m \le 0$, and the lines x = const.
The system $N_1$ consists of the curves obtainable by translations from $H$
or $H_1$ and of the lines x = const., y = const. Each of the systems satisfies
the conditions 1), 2) above and hence serves as system of geodesics for a
G-space.

Denote by $Y^+$ and $Y^+_{-1}$ the lines $x = 0$, $x = -1$ with the orientations
corresponding to increasing y. In both systems $N$, $N_1$, $Y^+$ is an asymptote
to $Y^+_{-1}$ and so is $Y^-$ to $Y^-_{-1}$; thus, $Y$ is parallel to $Y_{-1}$. In $N$ the line
$Y^+_{-1}$ is not the asymptote to $Y^+$ through (-1, 1), but $H^+$ is, whereas $Y^-_{-1}$ is
an asymptote to $Y^-$. In the system $N_1$ neither $Y^+_{-1}$ is an asymptote to $Y^+$
nor is $Y^-_{-1}$ to $Y^-$.

In the plane the parallel axiom in its usual form (namely, if $p \notin G$ then
there is exactly one line A through p which does not intersect G) is equivalent
to postulating that for any p and G, if $A^+$ is the asymptote to $G^+$ through p,
then so is $A^-$ to $G^-$. The uniqueness of the non-intersecting line implies
symmetry and transitivity.

The usual formulation of the corresponding hyperbolic axiom (if $p \notin G$
and $A^+$ is the asymptote to $G^+$ through p, then $A^-$ is not an asymptote to $G^-$)
does not imply symmetry. The intersections of the curves in the system $N$
just constructed with the domain $x < 0$, $y > -x$ provide an example.
This corresponds to the fact that in the foundations of hyperbolic geometry
symmetry and transitivity of the asymptote relation are proved
with the help of the congruence axioms.

Other questions concern the distances from $A^+$ to $G^+$. In any straight
space the existence of points $x_\nu \in A^+$ and $y_\nu \in G^+$ which tend on $A^+$
and $G^+$ in the positive direction to $\infty$ and for which $x_\nu y_\nu \to 0$ is sufficient
for $A^+$ and $G^+$ to be asymptotes to each other. But boundedness of xG,
when x traverses a positive subray of $A^+$, is neither necessary nor sufficient
for $A^+$ to be an asymptote to at least one orientation of G. In fact, very
surprising phenomena occur even for the ordinary lines as geodesics.

Let g(t) be defined and continuous for $t \ge 0$, $g(0) = 0$, $g(t_1) < g(t_2)$ for
$t_1 < t_2$ and $g(t) \to \infty$ for $t \to \infty$. Put

$$f(x, \alpha) = \operatorname{sign}(x_1 \cos\alpha + x_2 \sin\alpha)\, g(|x_1 \cos\alpha + x_2 \sin\alpha|).$$

Then the arguments of [3, Section 11] show readily that

(3)  $p_g(x, y) = \int_{-\pi/2}^{\pi/2} |f(x, \alpha) - f(y, \alpha)|\, d\alpha$

is a metrization of the $(x_1, x_2)$-plane as a G-space with the lines
$ax_1 + bx_2 + c = 0$ as geodesics. $p_g(x, y)$ is invariant under
$x_1' = x_1 \cos\alpha - x_2 \sin\alpha$, $x_2' = x_1 \sin\alpha + x_2 \cos\alpha$, so that the metric even
possesses the rotations about (0, 0). Nevertheless, simple estimates show that
for $g(t) = \log(1 + t)$ any two parallel lines $G_1$, $G_2$ have the property that
$p_g(x, G_2) \to 0$ when x traverses $G_1$ in either direction. For $g(t) = e^t - 1$,
we have $p_g(x, G_2) \to \infty$.
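
The metric (3) is easy to explore numerically. The sketch below reads (3) as the integral of $|f(x, \alpha) - f(y, \alpha)|$ over $-\pi/2 \le \alpha \le \pi/2$, with $f(x, \alpha) = \operatorname{sign}(s)\, g(|s|)$ and $s = x_1 \cos\alpha + x_2 \sin\alpha$, and approximates it by the midpoint rule. The function names, the number of nodes and the sample points are assumptions of this illustration. It checks that $g(t) = t$ gives twice the euclidean distance, and that for $g(t) = \log(1 + t)$ distances add up along a straight line but not along a detour.

```python
import math

def f(x, alpha, g):
    """f(x, alpha) = sign(s) * g(|s|) with s = x1 cos(alpha) + x2 sin(alpha)."""
    s = x[0] * math.cos(alpha) + x[1] * math.sin(alpha)
    return math.copysign(g(abs(s)), s)

def p_g(x, y, g, n=4000):
    """Midpoint-rule approximation of the integral (3) over [-pi/2, pi/2]."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        alpha = -math.pi / 2 + (k + 0.5) * h
        total += abs(f(x, alpha, g) - f(y, alpha, g))
    return total * h

identity = lambda t: t

# For g(t) = t the metric is twice the euclidean distance.
print(p_g((0.0, 0.0), (3.0, 4.0), identity))

# For g(t) = log(1 + t): additivity along the line through x, y, z ...
x, y, z = (-1.0, 2.0), (1.0, 3.0), (5.0, 5.0)
print(p_g(x, y, math.log1p) + p_g(y, z, math.log1p), p_g(x, z, math.log1p))

# ... and strict triangle inequality through a point w off that line.
w = (1.0, 0.0)
print(p_g(x, w, math.log1p) + p_g(w, z, math.log1p), p_g(x, z, math.log1p))
```

The additivity holds because, for each fixed $\alpha$, the integrand is monotone along a straight line; this is the reason the lines are geodesics.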

In straight spaces which satisfy the condition (a) for nonpositive
curvature, the boundedness of xG when x traverses a positive subray of
$A^+$ is necessary and sufficient for $A^+$ to be an asymptote to a suitable
orientation of G, so that the asymptote relation is transitive.

For the foundations of geometry it is of greater interest to see how
mobility eliminates the various abnormal occurrences. Assume a plane P
which is metrized as a straight space possesses a motion $\alpha$ which, reduced
to the straight line G, is a proper translation of G ($z(t)\alpha = z(t + a)$, $a \ne 0$).
Then the asymptote relation is transitive within the families of asymptotes
to $G^+$ and $G^-$ (see [3, Section 32]). The parallels to G are exactly those
lines which go into themselves under $\alpha$. They either cover all of P, or
a closed halfplane, or a closed strip which may reduce to G. If p does not
lie on a parallel to G, let $G_1$ be the parallel to G (possibly G itself) closest
to p. If x traverses an asymptote $A^+$ to $G_1^+$ (or $G^+$) in the positive
direction, then $xG_1 \to 0$; if x traverses $A^+$ in the negative direction, then


A characterization of the higher dimensional euclidean geometry, [3,
Theorem (24.10)]. In the foundations of geometry parallelism for lines in
space reduces to that in a plane, because only spaces are considered in
which any three points lie in a plane. In straight spaces of higher dimension
than two we mean by the parallel axiom the following two requirements:

The asymptote relation is symmetric. If $A^+$ is an asymptote to $G^+$ then so
is $A^-$ to $G^-$.

The metrics $p_g(x, y)$, which can be extended to higher dimensions, show
that this parallel axiom and the Theorem of Desargues or the existence
of planes do not imply that the space is Minkowskian (finite dimensional
linear). Without any postulates regarding the existence of planes the
higher dimensional euclidean geometry is characterized by the above parallel
axiom together with the existence and symmetry of perpendicularity in this
sense:

(4)  If $p \notin G$, $f \in G$, and $pf = \min_{x \in G} px$, then for every $x \in G$ also
$xf = \min_{y \in G(p, f)} xy$.

It  is  well-known,  see  [3,  p.  104],  that  there  are  Minkowski  planes  which 
are  not  euclidean  and  satisfy  (4).  Also,  (4)  is  without  the  parallel  axiom  a 
weak  condition;  it  is,  for  example,  satisfied  by  every  simply  connected 
Riemann  space  with  non-positive  curvature.  This  follows  from  [3,  Theo- 
rems (20.9)  and  (36.7)]  and  the  symmetry  of  perpendicularity  in  Riemann 
spaces. 

Similarities and differentiability, [5]. It is natural to ask how we can
recognize from the behaviour of our finite distances xy whether a G-space
possesses differentiability properties in terms of suitable coordinates. The
best guide is a statement which is formulable and false without, but
correct with, differentiability hypotheses.

A proper similarity of a G-space R is a mapping $\alpha$ of R on itself such
that $x\alpha\, y\alpha = k\, xy$ for all $x, y \in R$ where k is a positive constant different
from 1. A proper similarity has exactly one fixed point f. Because $\alpha^{-1}$ is
also a similarity and its factor is $k^{-1}$, we may assume that $k < 1$. Then
$x\alpha^\nu\, y\alpha^\nu = k^\nu\, xy$ and $x\alpha^\nu\, x\alpha^{\nu+1} = k^\nu\, x\, x\alpha$ show that $x\alpha^\nu$ is a Cauchy
sequence with a limit f and also that $y\alpha^\nu \to f$. It follows readily that the
space is straight.
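
The Cauchy property follows from the second identity by the usual geometric-series estimate. Spelled out as a routine sketch (writing $x_\nu$ for $x\alpha^\nu$, a notation introduced here only for brevity, and assuming $0 < k < 1$):

```latex
x_\nu x_{\nu+\mu}
  \le \sum_{i=0}^{\mu-1} x_{\nu+i}\, x_{\nu+i+1}
  = \sum_{i=0}^{\mu-1} k^{\nu+i}\, x\, x_1
  \le \frac{k^{\nu}}{1-k}\, x\, x_1 \longrightarrow 0
  \qquad (\nu \to \infty).
```

Since the space is complete, $x_\nu$ converges to a point f. The first identity gives $x_\nu y_\nu = k^\nu\, xy \to 0$, so $y\alpha^\nu \to f$ as well. Two fixed points f, f' would satisfy $ff' = k\, ff'$, hence $f = f'$.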

Linear spaces obviously possess similarities with arbitrary factors k,
but this does not characterize them among all G-spaces without
differentiability hypotheses. For if in (3) we choose $g(t) = t^\beta$, $\beta > 0$, then
$p_g(\delta x, \delta y) = \delta^\beta p_g(x, y)$, where $\delta x = (\delta x_1, \delta x_2)$, so that for $\delta^\beta = k$ the
mapping $x \to \delta x$ is a similarity with the factor k; yet the space is linear only
for $\beta = 1$.
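
The homogeneity used here can be verified in one line. As a sketch, take $\delta > 0$, $g(t) = t^\beta$, and write $s = x_1 \cos\alpha + x_2 \sin\alpha$ (a shorthand introduced only for this computation), so that $f(x, \alpha) = \operatorname{sign}(s)\, |s|^\beta$:

```latex
f(\delta x, \alpha) = \operatorname{sign}(\delta s)\, |\delta s|^{\beta} = \delta^{\beta} f(x, \alpha),
\qquad
p_g(\delta x, \delta y)
  = \int_{-\pi/2}^{\pi/2} \bigl| \delta^{\beta} f(x, \alpha) - \delta^{\beta} f(y, \alpha) \bigr| \, d\alpha
  = \delta^{\beta}\, p_g(x, y).
```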

Differentiability always means a locally nearly linear behaviour.
We say that a G-space is continuously differentiable at p if for any sequence
of triples of distinct points $a_\nu$, $b_\nu$, $c_\nu$ which tend to p, and for any points
$b_\nu'$, $c_\nu'$ with $(a_\nu b_\nu' b_\nu)$, $(a_\nu c_\nu' c_\nu)$ and
$a_\nu b_\nu' : a_\nu b_\nu = a_\nu c_\nu' : a_\nu c_\nu = t_\nu$, we have

$$\lim_{\nu \to \infty} b_\nu' c_\nu' / (t_\nu\, b_\nu c_\nu) = 1.$$

(Differentiability would correspond to the special case $a_\nu = p$ and proves
insufficient as our example shows.) A G-space is Minkowskian if it possesses
one proper similarity $\alpha$ and is continuously differentiable at the
fixed point of $\alpha$.

This form of differentiability is adequate also for the problem originally
posed: Let a G-space R be continuously differentiable at p. Put $p_t = p$,
and for $q \ne p$, let $(p\, q_t\, q)$ and $pq_t : pq = t$. Then for $x, y \in S(p, \rho_p)$, the
limit

$$m_p(x, y) = \lim_{t \to 0+} x_t y_t / t$$

exists. Obviously $m_p(p, x) = px$. The metric $m_p(x, y)$ can be extended so
as to yield a space satisfying I, II, III, and IV in the strong form that a
point z with (xyz) exists for any $x \ne y$. In general V will not hold for this
"tangential metric" at p. If V does hold we say, following the terminology
of the calculus of variations, that the space is regular at p.

If the space is continuously differentiable and regular at p, then the
metric $m_p(x, y)$ is Minkowskian and $xy / m_p(x, y) \to 1$ for $x \ne y$ and
$x \to p$, $y \to p$.

If the space R is a Finsler space with $ds = f(x, dx)$ as above and R is of
class $C^m$, $m \ge 4$, and f is of class $C^{m-1}$ for $dx \ne 0$, then R will, in
$S(p, \rho_p) - p$, be at least of class $C^{m-2}$, and f of class $C^{m-3}$ in affine
coordinates belonging to $m_p$ (normal coordinates). Thus we obtain a complete
decision of the problem whether a G-space is a Finsler space of class $C^\infty$
and a partial solution for finite m.

Two dimensional Riemann spaces, [1]. The great variety of metrics
satisfying the Axioms I to V indicates that relaxing these axioms essentially
without adding others leads to spaces with too little structure for
a significant theory. On the other hand there are surfaces in $E^3$ like
polyhedra and general convex surfaces which do not satisfy our axioms,
hence still less the assumptions of classical differential geometry, but are
geometrically most interesting.

It is the purpose of A. D. Alexandrov's theory to define and study
(intrinsically and extrinsically) a class of surfaces narrow enough to
encompass deep results and yet wide enough to include, for example, the
mentioned surfaces. He assumes that the space R be a two-dimensional
manifold, metrized such that the distance of any two points x, y equals
the greatest lower bound of the lengths of all curves from x to y. (If II
holds, this implies III.) The problem is how to introduce the Riemannian
character of the metric without differentiability. Alexandrov's principal tool
is the (upper) angle α(T, T') between two segments T, T' with the same
origin z: If x(t), y(t), t ≥ 0, x(0) = y(0) = z, represents T and T', then

$$\alpha(T, T') = \limsup_{s \to 0+,\ t \to 0+} \arccos \frac{s^2 + t^2 - \bigl(x(s)\,y(t)\bigr)^2}{2st},$$

where 0 ≤ arc cos ≤ π. For a geodesic triangle D with sides T, T', T''
we define the excess as

e(D) = α(T, T') + α(T', T'') + α(T'', T) − π.
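As a numerical illustration only: the arc cos term in the definition of the upper angle is the angle of a Euclidean comparison triangle, so for a triangle in the Euclidean plane the three angles are given exactly by the law of cosines and the excess vanishes. The sketch below checks this; the function names are hypothetical and not taken from the text.

```python
import math

def comparison_angle(a, b, c):
    """Angle opposite side c in a Euclidean triangle with side lengths a, b, c."""
    cos_value = (a * a + b * b - c * c) / (2.0 * a * b)
    # Clamp against rounding so that acos receives a value in [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_value)))

def euclidean_excess(a, b, c):
    """Sum of the three angles minus pi; zero for every Euclidean triangle."""
    angles = (comparison_angle(b, c, a),
              comparison_angle(c, a, b),
              comparison_angle(a, b, c))
    return sum(angles) - math.pi
```

A triangle on a curved surface would be handled by replacing the side lengths with intrinsic distances, in which case the excess is in general not zero.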

The Riemannian character of the metric enters through the requirement
that for every compact subset M of R a number β(M) exists such
that for any finite set of non-overlapping triangles D_1, ..., D_m in M

$$\sum_{i=1}^{m} |e(D_i)| < \beta(M). \tag{5}$$

According to Zalgaller [17] it suffices to require Σ e(D_i) < β(M), in
other words, the triangles with negative excess never cause any trouble.



That  the  condition  (5)  is  really  essentially  Riemannian  follows  from  the 
fact  that  a  Minkowski  plane  does  not  satisfy  it  unless  it  is  euclidean. 

To these so-called surfaces with bounded curvature Alexandrov extended
some of the deepest theorems of differential geometry; we mention only
Weyl's problem: Let the sphere or the plane be metrized such that I, II, III
and (5) hold. Assume moreover that the excess e(D) > 0 for all small
geodesic triangles. Then the metric can be realized by a closed or by a
complete open convex surface in E^3. Actually the open surfaces were not
covered by the classical methods, and the results of Alexandrov on the
deformation of convex surfaces surpass by far anything obtainable by the
traditional approach.

The theorem of Nash-Kuiper on the C^1-imbeddability in E^3 of given
abstract two-dimensional Riemannian manifolds in the classical sense
stresses the significance of these results, because it shows that a reasonable
and general class of surfaces in E^3 cannot be defined in terms of
differentiability conditions only.


Bibliography 

[1]  ALEXANDROW, A. D., Die innere Geometrie der konvexen Flächen. Berlin 1955,
     XVII + 522 pp.
[2]  BOCHNER, S., Vector fields and Ricci curvature. Bulletin of the American
     Mathematical Society, vol. 52 (1946), pp. 776-797.
[3]  BUSEMANN, H., The geometry of geodesics. New York 1955, X + 422 pp.
[4]  BUSEMANN, H., Metrizations of projective spaces. Proceedings of the American
     Mathematical Society, vol. 8 (1957), pp. 387-390.
[5]  BUSEMANN, H., Similarities and differentiability. Tôhoku Mathematical Journal,
     Sec. Ser., vol. 9 (1957), pp. 56-67.
[6]  BUSEMANN, H., Convex Surfaces. New York 1958, VII + 194 pp.
[7]  BUSEMANN, H., Spaces with finite groups of motions. Journal de Mathématiques
     pures et appliquées, 9th Ser., vol. 37 (1958), pp. 365-373.
[8]  HADAMARD, J., Les surfaces à courbures opposées et leurs lignes géodésiques.
     Journal de Mathématiques pures et appliquées, 5th Ser., vol. 4 (1898),
     pp. 27-73.
[9]  HILBERT, D., Grundlagen der Geometrie. 8th ed., Stuttgart 1956, VII + 251 pp.
[10] HOPF, E., Closed surfaces without conjugate points. Proceedings of the National
     Academy of Sciences, vol. 34 (1948), pp. 47-51.
[11] KELLY, P. J., and STRAUS, E. G., Curvature in Hilbert geometry. Pacific Journal
     of Mathematics, vol. 8 (1958), pp. 119-126.



[12] PEDERSEN, F. P., On spaces with negative curvature. Matematisk Tidsskrift B
     1952, pp. 66-89.
[13] SKORNYAKOV, L. A., Metrization of the projective plane in connection with a
     given system of curves (Russian). Izvestiya Akademii Nauk SSSR, Seriya
     Matematičeskaya 19 (1955), pp. 471-482.
[14] TITS, J., Sur certaines classes d'espaces homogènes de groupes de Lie. Académie
     royale de Belgique, Classe des sciences, Mémoires, Collection in-8°, vol. 39
     fasc. 3 (1955), 268 pp.
[15] WALD, A., Begründung einer koordinatenlosen Differentialgeometrie der Flächen.
     Ergebnisse eines mathematischen Kolloquiums (Wien) Heft 7 (1936),
     pp. 24-46.
[16] WANG, H. C., Two-point homogeneous spaces. Annals of Mathematics, vol. 55
     (1952), pp. 177-191.
[17] ZALGALLER, V. A., On the foundations of the theory of two-dimensional manifolds
     with bounded curvature (Russian). Doklady Akademii Nauk SSSR, vol. 108
     (1956), pp. 575-576.


Symposium  on  the  Axiomatic  Method 


AXIOMS  FOR  INTUITIONISTIC  PLANE  AFFINE  GEOMETRY 

A.  HEYTING 

University  of  Amsterdam,  Amsterdam,  Netherlands 

1. Introduction. At first sight it may appear that the axiomatic method
cannot be used in intuitionistic mathematics, because only mathematical
objects which have been constructed are considered there, so that it
makes no sense to derive consequences from hypotheses which are not yet
realized. Yet inspection of the methods which are actually used in
intuitionistic mathematics shows that they are to a large extent
axiomatic in nature, though the significance of the axiomatic method is
perhaps somewhat different from that which it has in classical
mathematics.

In principle every theorem can be expressed in the form of an axiomatic
theory. Instead of "Every natural number is a product of prime
numbers" we can write "Axiom: n is a natural number. Theorem: n is
a product of prime numbers." This way of presentation becomes
practicable whenever a great number of theorems contain the same
complicated set of hypotheses. Conversely, every axiomatic theory can be
read as one general theorem of the form: "Whenever we have constructed
a mathematical object M satisfying the axioms A, we can affirm about M
the theorems T."

Of course the content of the theory will be influenced by the
intuitionistic point of view; in particular, questions of effective
constructibility will be of main importance. In order to give an idea of
these differences I shall show the method at work in an example, which I
have so chosen that the problem is trivial in classical mathematics, so that
the intuitionistic difficulties appear, so to say, in their purest form.

2. The problem. In [1] and [2] I gave a system of axioms for
intuitionistic plane projective geometry. Here I wish to give a system of
axioms for plane affine geometry which is satisfied by the intuitionistic
analytical geometry, and which allows us to construct, by a suitable
extension of the plane, a projective plane which satisfies the axioms of [1]
and [2]. This problem is easy in the case of desarguesian geometry, because
then the extension can be effected by means of harmonic conjugates. If only
the trivial axioms of incidence are assumed, the problem is still easy from
the classical point of view, but it presents serious difficulties in
intuitionistic mathematics. These difficulties are caused by the fact that
not only points at infinity must be adjoined to the affine plane, but also
points for which it is unknown whether they are at infinity or not.

3. The axiom system for projective geometry.

FUNDAMENTAL NOTIONS:

Two disjoint sets 𝔓 and 𝔏; the elements of 𝔓 are called points, those
of 𝔏 lines.

A relation #, whose domain and range are 𝔓; this relation is called
apartness.

A relation ∈, whose domain is 𝔓 and whose range is 𝔏; this relation is
called incidence.

Notation: Capitals in italics denote points; lower case italics denote
lines.

Free use is made of such expressions as "a line through a point"; the
translation into the incidence language is left to the reader. Also, line l
is sometimes identified with the set of points incident with l, without
further explanation. It would be easy to avoid such identifications by a
somewhat clumsier presentation. In particular the notation l ∩ m is used
for the set of points which are incident with l as well as with m.

Logical signs are used as abbreviations. They must be understood in
the intuitionistic sense (see [3] and [4] or [5]).

→ stands for implication, & for conjunction, ∨ for disjunction, ¬ for
negation,

(∀x) is the universal quantifier (for every x),

(∃x) is the existential quantifier (there exists an x such that).

AXIOMS FOR APARTNESS:
S1    A # B → B # A.
S2    ¬ A # B ↔ A = B.
S3    A # B → (∀C)(C # A ∨ C # B).
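A short worked derivation may help to show how the three axioms interact. It is a standard consequence of S1-S3, written out here as an illustration and not quoted from the text: equality, characterised by S2 as the negation of apartness, is transitive.

```latex
% Transitivity of equality, read off from S1-S3 (illustrative derivation).
\begin{align*}
&\text{Assume } A = B \text{ and } B = C, \text{ and suppose } A \mathrel{\#} C. \\
&\text{By S3 (applied to the point } B\text{): } B \mathrel{\#} A \ \lor\ B \mathrel{\#} C. \\
&B \mathrel{\#} A \text{ gives } A \mathrel{\#} B \text{ by S1, contradicting } A = B \text{ by S2}; \\
&B \mathrel{\#} C \text{ contradicts } B = C \text{ by S2}. \\
&\text{Hence } \neg\, A \mathrel{\#} C, \text{ and therefore } A = C \text{ by S2}.
\end{align*}
```

Only the negation of the supposition A # C is derived, which is all that S2 requires; no appeal to the excluded middle is made.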



GEOMETRICAL AXIOMS:

P1    A # B → (∃l)(A ∈ l & B ∈ l).

P2    A # B & A ∈ l ∩ m & B ∈ l ∩ m → l = m.

DEFINITION 1.     A lies OUTSIDE l (A ω l) if (∀B)(B ∈ l → B # A).
DEFINITION 2.     l lies APART FROM m (l # m), if (∃A)(A ∈ l & A ω m).

P3    l # m → (∃A)(A ∈ l ∩ m).

P4    A # B & A ∈ l & B ∈ l & C ω l & A ∈ m & C ∈ m → B ω m.

P5    (i) There exist two points, A and B, so that A # B;

      (ii) Every line contains at least three points A, B, C, so that
           A # B, A # C and B # C;

      (iii) When l is a line, a point outside l can be found.

DEFINITION 3.  If A # B, then the line l satisfying A ∈ l & B ∈ l is
denoted by AB.

It can be proved from these axioms that the relation # between lines
(Definition 2) is an apartness relation; this means that it satisfies axioms
S1-S3 for lines instead of points.

4. The axiom system for affine geometry.

FUNDAMENTAL NOTIONS: 𝔓, 𝔏, #, ∈, as in § 3.
AXIOMS FOR APARTNESS: S1-S3, as in § 3.
DEFINITIONS: 1 and 2 as in § 3.
GEOMETRICAL AXIOMS:

A1    l # m & A ω l → (∃p)(A ∈ p & l ∩ p = l ∩ m).

A2    A # B & A ∈ l ∩ m & B ∈ l ∩ m → l = m.

DEFINITION 4.     l INTERSECTS m if l # m & (∃A)(A ∈ l ∩ m).
A3    l intersects m → (∀p)((∃A)(A ∈ l ∩ p) ∨ (∃B)(B ∈ m ∩ p)).
A4    A # B & A ∈ l & B ∈ l & C ω l & A ∈ m & C ∈ m → B ω m.
A5   P



DEFINITION 5.     l is PARALLEL to m (l // m) if (∀A)(A ∈ l → A ω m).
Remark: A5 can now be formulated as follows: l ∩ m = ∅ & m # l → l // m.

A6    (∀l)(∃m)(l // m).

A7    (i) There exists at least one line;

      (ii) Every line is incident with at least four points every two of
           which are apart from each other;

      (iii) A # B → (∃l)(A ∈ l & B ω l).
      (iv) A ∈ l → (∃m)(A ∈ m & l # m).

Remarks: (1) If l and m have a common point B, A1 asserts the
existence of a line joining A and B (see Th. 1a). On the other hand, if l and
m have no common point, it follows from A1 that there is a line through A
which does not intersect l; this is part of the assertion of existence of
parallels. Moreover, A1 admits an assertion in the case that it is unknown
whether l intersects m. (2) A2 = P2. (3) A3 is a strong form of the
uniqueness assertion for parallels. (4) A4 is called the triangle axiom.
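A finite classical model can make the content of the axioms concrete. The sketch below is an illustration under stated assumptions and not part of the text: it takes the affine plane over the integers modulo 5, reads apartness of points as inequality, and checks A1, A2, A3, A6 and A7 by brute force. Since equality is decidable in this model, it says nothing about the intuitionistic subtleties; all names are hypothetical.

```python
P = 5  # assumption: a prime, so Z/P is a field; P = 5 gives five points per line

points = [(x, y) for x in range(P) for y in range(P)]
directions = [(1, s) for s in range(P)] + [(0, 1)]

def line_through(point, direction):
    """All points point + t * direction, with coordinates taken modulo P."""
    (x, y), (dx, dy) = point, direction
    return frozenset(((x + t * dx) % P, (y + t * dy) % P) for t in range(P))

lines = list({line_through(pt, d) for pt in points for d in directions})

def outside(A, l):
    """Definition 1: A is apart from (here: different from) every point of l."""
    return all(B != A for B in l)

def apart(l, m):
    """Definition 2: some point of l lies outside m."""
    return any(outside(A, m) for A in l)

def intersects(l, m):
    """Definition 4: l and m are apart and have a common point."""
    return apart(l, m) and bool(l & m)

def parallel(l, m):
    """Definition 5: every point of l lies outside m."""
    return all(outside(A, m) for A in l)

def holds_A1():
    return all(any(A in p and (l & p) == (l & m) for p in lines)
               for l in lines for m in lines if apart(l, m)
               for A in points if outside(A, l))

def holds_A2():
    # Two distinct common points force the lines to coincide.
    return all(l == m for l in lines for m in lines if len(l & m) >= 2)

def holds_A3():
    return all(bool(l & p) or bool(m & p)
               for l in lines for m in lines if intersects(l, m)
               for p in lines)

def holds_A6():
    return all(any(parallel(l, m) for m in lines) for l in lines)

def holds_A7():
    enough_points = all(len(l) >= 4 for l in lines)                    # (ii)
    line_missing_b = all(any(A in l and outside(B, l) for l in lines)
                         for A in points for B in points if A != B)   # (iii)
    second_line = all(any(A in m and apart(l, m) for m in lines)
                      for l in lines for A in l)                       # (iv)
    return bool(lines) and enough_points and line_missing_b and second_line
```

The modulus 5 rather than 3 is chosen because A7(ii) asks for four mutually apart points on every line.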

5. Elementary theorems.

THEOREM 1.     If l # m, A ∈ l ∩ m, B ∈ l ∩ m, then A = B.

PROOF: Suppose A # B; then, by A2, l = m, which contradicts
l # m. So A = B.

Remark: In the case of Th. 1 we write l ∩ m = A.
THEOREM 1a.     A # B → (∃l)(A ∈ l & B ∈ l).

PROOF: By A7(iii) there is a line p so that A ∈ p & B ω p.
By A7(iv) there is a line m so that A ∈ m & p # m.
By Th. 1, p ∩ m = A. By A1 there is a line l so that B ∈ l and p ∩ l =
p ∩ m = A; it follows that A ∈ l.

THEOREM 2.  The relation # between lines is an apartness relation, i.e.
it possesses the properties (i)-(iii):

(i) l # m → m # l.

(ii) ¬ l # m ↔ l = m.
(iii) l # m → (∀p)(p # l ∨ p # m).
LEMMA 2.1.     (iii) holds if l intersects m in S and S ∈ p.



PROOF: Choose points A, B, C so that A ∈ l, A ω m (Def. 2); B ∈ m,
B # S; C ∈ m, C # S, C # B. This is possible by A7(ii), S3. By A4,
B ω AC, so AB # AC. By A3, p has a point in common with AB or with
AC; say D ∈ p ∩ AB. D # A ∨ D # B. If D # A, then D ω l (A4), so
p # l; if D # B, then D ω m (A4), so p # m.

PROOF OF (i): Choose A, B, C so that A ∈ l, A ω m, B ∈ m, C ∈ m, B # C.
AB # AC (A4). l # AB ∨ l # AC (Lemma 2.1). If l # AB, we choose
D on l so that D ω AB; then B ω l (A4), so m # l. Similarly, if l # AC.

LEMMA 2.2.     ¬ P ω l → P ∈ l.

PROOF: Choose C and m so that C ω l, C ∈ m, P ω m; then l # m.
Choose A so that A ∈ m, A ω l (Th. 2 (i)). PA # m. By A3, l has a point in
common with PA or with m. If l intersects m in S, we choose B so that
B ∈ m, B ω PA. PA # PB; l has a point in common with PA or with PB.
Say Q ∈ l ∩ PA. Now suppose P # Q; then P ω l, which contradicts the
hypothesis, so ¬ P # Q, so P = Q (S2), so P ∈ l.

PROOF OF (ii): ¬ l # m → l = m is an immediate consequence of
Lemma 2.2, while l = m → ¬ l # m is a consequence of S2.

PROOF OF (iii): Choose P, Q, R, S so that P ∈ l, P ω m; Q, R, S ∈ m;
Q # R # S # Q. As in the proof of (i) it can be shown that at least two
of the points Q, R, S are outside l; say Q, R ω l. PQ # PR, so p has a
point in common with PQ or with PR; say A ∈ p ∩ PQ. A # P ∨ A # Q.
If A # P, then A ω l, so p # l. If A # Q, then A ω m, so p # m.

THEOREM 3.     P ω l → (∃m)(P ∈ m & m // l).

PROOF: Draw a line p // l (A6). By A1, there is a line m through P so
that l ∩ m = l ∩ p = ∅. By A5, m // l.

THEOREM 4.    l // m & l // n & m # n → m // n.

PROOF: Choose P and Q so that P ∈ n, P ω m, Q ∈ m. PQ # m. By A3,
l has a point S in common with PQ. S ω n; by A4, Q ω n. As Q is an
arbitrary point on m, we have m // n.

6. Projective points.

DEFINITION 6.     l # m → 𝔓(l, m) = {x | l ∩ m = l ∩ x ∨ l ∩ m = m ∩ x}.
If l # m, then 𝔓(l, m) is a PROJECTIVE POINT (abbreviation: p. point).



Remarks: Where 𝔓(l, m) occurs, it is understood that l # m. German
capitals will be used to denote projective points. The next theorem shows
that the notion of a projective point is an extension of that of a line pencil
in the usual sense.

THEOREM 5.  If l intersects m in S, then 𝔓(l, m) is the class of all lines
through S. If l // m, then 𝔓(l, m) = {x | x // l ∨ x // m}.

PROOF: The first part of the theorem is obvious. If l // m, and
n ∈ 𝔓(l, m), then l ∩ n = ∅ or m ∩ n = ∅; also l # n or m # n. The only
case which needs to be further considered is l ∩ n = ∅ & m # n. Suppose
S ∈ m ∩ n; then by A3, l has a point in common with m, in contradiction
with l // m. Thus m ∩ n = ∅, so m // n. Conversely, if l // m and n // l,
then l ∩ m = l ∩ n, so n ∈ 𝔓(l, m).
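Theorem 5 can be checked by brute force in a small classical model. The sketch below is illustrative only: it assumes the affine plane over the integers modulo 5, treats apartness as inequality, and computes 𝔓(l, m) directly from Definition 6. The helper names are hypothetical.

```python
P = 5  # assumption: illustrative model, the affine plane over Z/5

points = [(x, y) for x in range(P) for y in range(P)]
directions = [(1, s) for s in range(P)] + [(0, 1)]

def line_through(point, direction):
    """All points point + t * direction, with coordinates taken modulo P."""
    (x, y), (dx, dy) = point, direction
    return frozenset(((x + t * dx) % P, (y + t * dy) % P) for t in range(P))

lines = list({line_through(pt, d) for pt in points for d in directions})

def projective_point(l, m):
    """Definition 6, read classically; l and m are assumed to be distinct."""
    common = l & m
    return frozenset(x for x in lines if (l & x) == common or (m & x) == common)

def pencil(S):
    """All lines through the point S."""
    return frozenset(x for x in lines if S in x)

def parallel_class(l):
    """The line l together with every line that has no point in common with l."""
    return frozenset(x for x in lines if x == l or not (x & l))

def theorem_5_holds():
    for l in lines:
        for m in lines:
            if l == m:
                continue
            common = l & m
            if common:                 # l intersects m in a single point S
                (S,) = common
                expected = pencil(S)
            else:                      # l and m are parallel
                expected = parallel_class(l)
            if projective_point(l, m) != expected:
                return False
    return True
```

In this model every projective point is either a pencil or a parallel class; the interest of the intuitionistic theory lies precisely in the cases where that alternative cannot be decided, which a finite model cannot exhibit.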

THEOREM 6.     p, q ∈ 𝔓(l, m) & p # q → 𝔓(l, m) = 𝔓(p, q).
LEMMA 6.1.     l # m & l # n & l ∩ m = m ∩ n → l ∩ m = l ∩ n.

PROOF: It follows from the hypothesis that l ∩ m ⊂ l ∩ n. Suppose
P ∈ l ∩ n; then l intersects n, so m has a point in common with l or with
n (A3).

Case 1. m intersects l. As l ∩ n = P and l ∩ m ⊂ l ∩ n, we have l ∩ m =
P = l ∩ n.

Case 2. Q ∈ m ∩ n. Then Q ∈ l ∩ m, so Q ∈ l ∩ n; it follows that
Q = P and that l ∩ n ⊂ l ∩ m.

COROLLARY: In the case that l # n we have n ∈ 𝔓(l, m) ↔ l ∩ m = l ∩ n.
LEMMA 6.2.     l # m # n # l & n ∈ 𝔓(l, m) → 𝔓(l, m) = 𝔓(l, n).

PROOF: By hypothesis and Lemma 6.1 we have l ∩ m = l ∩ n = m ∩ n.
Suppose p ∈ 𝔓(l, m). p # l or p # m & p # n.

Case 1. p # l. Then l ∩ m = l ∩ p (Lemma 6.1), so l ∩ n = l ∩ p, so
p ∈ 𝔓(l, n).

Case 2. p # m & p # n. Now l ∩ m = m ∩ p. X ∈ l ∩ n → X ∈ l ∩ m
→ X ∈ m ∩ p, so X ∈ n ∩ p. It follows that l ∩ n ⊂ n ∩ p. Now suppose
Y ∈ n ∩ p. By A3, we may distinguish subcases 2a: (m intersects n) and
2b: (m intersects p).

Case 2a. m intersects n in Z. m ∩ n = l ∩ m = m ∩ p, so Z ∈ n ∩ p.
It follows that Z = Y, and that Y ∈ m ∩ n, so Y ∈ l ∩ n.

Case 2b. m intersects p in Z. Z ∈ m ∩ p = l ∩ m = l ∩ n, so Z ∈ n ∩ p.
It follows that Y = Z and that Y ∈ m ∩ p = l ∩ n.

In case 2a as well as in case 2b we have proved that n ∩ p ⊂ l ∩ n; thus
in case 2, l ∩ n = n ∩ p, that is p ∈ 𝔓(l, n). We have proved that
𝔓(l, m) ⊂ 𝔓(l, n). In particular, m ∈ 𝔓(l, n); the same proof then gives us
𝔓(l, n) ⊂ 𝔓(l, m).

LEMMA 6.3.  l # m & l # n & n ∈ 𝔓(l, m) → 𝔓(l, m) = 𝔓(l, n).

PROOF: Choose A and B so that A ∈ l, A ω m, B ∈ m, B ω l.
By A3, n intersects l ∨ (∃C)(C ∈ n ∩ AB).

If n intersects l in P, l ∩ m = l ∩ n = P, so 𝔓(l, m) = 𝔓(l, n).
If C ∈ n ∩ AB, we have n # m ∨ n # AB.
If n # m, we can apply Lemma 6.2.

If n # AB, we choose D so that D # A, B, C and D ∈ AB. By A1, there
is a line p so that D ∈ p & p ∈ 𝔓(l, m). Now by Lemma 6.2, 𝔓(l, m) =
𝔓(l, p) = 𝔓(l, n).

PROOF OF THEOREM 6: We may suppose that q # l.

Choose n in 𝔓(l, m) so that n # l, q (see the proof of Lemma 6.3). By
Lemma 6.3 we have 𝔓(l, m) = 𝔓(l, n) = 𝔓(n, q) = 𝔓(p, q).

DEFINITION 7.  If l intersects m in S, then 𝔓(l, m) is a PROPER p. point
and we write 𝔓(l, m) = S. If l // m, then 𝔓(l, m) is an IMPROPER p. point.

Remark: It is by no means true that every p. point is either proper or
improper. However, as an immediate consequence of A5, a p. point that
cannot be proper, is improper.


DEFINITION 8.    l lies OUTSIDE 𝔄 (l ω 𝔄) if (∀p)(p ∈ 𝔄 → p # l).
DEFINITION 9.     𝔄 lies APART from 𝔅 (𝔄 # 𝔅) if (∃p)(p ∈ 𝔄 & p ω 𝔅).

Remark. If 𝔄 is a proper p. point A, then l ω 𝔄 is equivalent with A ω l.
This is easily proved by means of Axiom A4. It follows that, for proper p.
points 𝔄 = A and 𝔅 = B, 𝔄 # 𝔅 is equivalent to A # B. Axiom A1
can now be read as follows:

If A is a proper p. point and 𝔅 a p. point so that A # 𝔅, then there exists
a line l so that A ∈ l and l ∈ 𝔅. This line will be denoted by A𝔅. It follows
from Th. 8 below that it is unique.

THEOREM 7.  The relation # between projective points is an apartness
relation; that is, it possesses the properties (i), (ii), (iii):

(i) 𝔄 # 𝔅 → 𝔅 # 𝔄.
(ii) ¬ 𝔄 # 𝔅 ↔ 𝔄 = 𝔅.
(iii) 𝔄 # 𝔅 → (∀ℭ)(𝔄 # ℭ ∨ 𝔅 # ℭ).


LEMMA 7.1.     m intersects l & D ∈ m ∩ p & D ω l → p intersects l ∨ p # m.

PROOF: There is a line m' through D so that m' // l (Th. 3); m # m'.
p # m or p # m'; if p # m', then l intersects p (A3).

LEMMA 7.2:  C ∈ l & C ω m & l # n & l ∩ m = l ∩ n → C ω n.

PROOF: Choose A so that A ∈ n & A ω l. AC = p. The line n intersects l
∨ n # p (Lemma 7.1). If n intersects l, then m intersects l, because
l ∩ m = l ∩ n, so C ω n (A4). If n # p, then C ω n (A4).

PROOF OF THEOREM 7 (i): Choose l and m so that 𝔄 = 𝔓(l, m) and that
l ω 𝔅; choose C on l so that C ω m. There is a line p so that C ∈ p, p ∈ 𝔅
(A1); p # l. Let n be a line in 𝔄; n # l ∨ n # p.

If n # l, then C ω n (Lemma 7.2), so n # p.

We have proved that p ω 𝔄, so 𝔅 # 𝔄.

PROOF OF THEOREM 7(ii): 𝔄 = 𝔓(l, m). Let p be a line in 𝔅, then
¬ p ω 𝔄. p # l ∨ p # m; suppose p # l. I shall prove that l ∩ m = l ∩ p.

Suppose X ∈ l ∩ m, so that 𝔄 consists of all the lines through X. If we
had X ω p, then p ω 𝔄, which gives a contradiction, so X ∈ p. Thus
l ∩ m ⊂ l ∩ p.

Suppose Y ∈ l ∩ p. I derive a contradiction from Y ω m, as follows:
Choose n in 𝔄; n # l ∨ n # p.

If n # l, then Y ω n (Lemma 7.2), so n # p.

Now n # p for every n in 𝔄, so p ω 𝔄. This is the desired contradiction.
Thus Y ∈ m, and l ∩ p ⊂ l ∩ m.

PROOF OF THEOREM 7(iii): Choose l in 𝔄 so that l ω 𝔅; further m, r, s
so that 𝔄 = 𝔓(l, m), ℭ = 𝔓(r, s). l # r ∨ l # s; say l # r. As in the proof
of part (i), we find a line p in 𝔅, so that p ω 𝔄 and that p intersects l in
D; D can so be chosen that D ω r [If D, E ∈ l and D # E, then D ω m
or E ω m; this is shown in the proof of Lemma 2.1. By choosing three
points on l we find at least one point outside m and outside r]. Draw the
line t so that D ∈ t, t ∈ ℭ (A1). l # t ∨ p # t. Suppose l # t. For an
arbitrary line u in ℭ we have u # l ∨ u # t. If u # t, then D ω u (Lemma
7.2), so u # l. It follows that l ω ℭ, so 𝔄 # ℭ. Similarly, if p # t, we have
𝔅 # ℭ.

THEOREM 8.     𝔄 # 𝔅 & l ∈ 𝔄 ∩ 𝔅 & m ∈ 𝔄 ∩ 𝔅 → l = m.

PROOF: Suppose l # m; then 𝔄 = l ∩ m and 𝔅 = l ∩ m. Apply Th.



7. Projective lines.

DEFINITION 9.     𝔄 # 𝔅 → λ(𝔄, 𝔅)


If 𝔄 # 𝔅, then λ(𝔄, 𝔅) is a PROJECTIVE line (p. line); where λ(𝔄, 𝔅)
occurs, it is understood that 𝔄 # 𝔅.

Greek lower case letters will be used to denote projective lines.

THEOREM 9.     𝔄 # 𝔅 & l ∈ 𝔄 ∩ 𝔅 → λ(𝔄, 𝔅) = {ℭ | l ∈ ℭ}.

PROOF: I. Let ℭ be a p. point such that l ∈ ℭ. ℭ # 𝔄 ∨ ℭ # 𝔅; suppose
ℭ # 𝔄. l ∈ 𝔄 ∩ 𝔅 and l ∈ 𝔄 ∩ ℭ, so 𝔄 ∩ 𝔅 = 𝔄 ∩ ℭ (Th. 8), so
ℭ ∈ λ(𝔄, 𝔅).

II.     Let 𝔓 be a p. point in λ(𝔄, 𝔅). 𝔄 ∩ 𝔓 = 𝔄 ∩ 𝔅 or 𝔅 ∩ 𝔓 = 𝔄 ∩ 𝔅;
in either case l ∈ 𝔓.

If, as in Th. 9, λ(𝔄, 𝔅) contains a line l, then it is called a PROPER
projective line; we write in this case λ(𝔄, 𝔅) = l.

THEOREM 10.     If 𝔄 # 𝔅 and λ(𝔄, 𝔅) is a proper p. line l, then either
𝔄 or 𝔅 is a proper p. point.

PROOF: Choose m so that m ∈ 𝔄, m ω 𝔅, and C so that C ∈ m, C ω l.
C𝔅 = n. By A3, l intersects m or n, so 𝔄 is proper or 𝔅 is proper.

DEFINITION 10.  A p. point 𝔄 lies OUTSIDE the p. line λ (𝔄 ω λ), if 𝔄 is
apart from every p. point in λ.

THEOREM 11:  𝔄 ω l is equivalent with l ω 𝔄.
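The membership condition used in part II of this proof has the same shape as the condition in Definition 6. Assuming that the defining condition for a projective line is built in the same way from intersections of p. points (an assumption made here for illustration, not a quotation of the text), it would read as follows, with 𝔛 ranging over p. points:

```latex
% Hedged reconstruction by analogy with Definition 6; not a quotation.
\[
\mathfrak{A} \mathrel{\#} \mathfrak{B} \;\rightarrow\;
\lambda(\mathfrak{A}, \mathfrak{B}) =
\{\, \mathfrak{X} \mid
\mathfrak{A} \cap \mathfrak{X} = \mathfrak{A} \cap \mathfrak{B}
\ \lor\
\mathfrak{B} \cap \mathfrak{X} = \mathfrak{A} \cap \mathfrak{B} \,\}
\]
```

Under this reading, part II is immediate: if l belongs to 𝔄 ∩ 𝔅, then either alternative places l in 𝔛.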

LEMMA 11.1:  If 𝔄 # 𝔅, 𝔄 is proper = A, A𝔅 = l, n ∈ 𝔅, n # l, then
A ω n.

PROOF: Choose P so that P ∈ n, P ω l. AP = p.

By Lemma 7.1, n intersects l or n # p.

If n intersects l, then 𝔅 is proper, so A ω n by Axiom A4.

If n # p, then A ω n, also by Axiom A4.

LEMMA 11.2:  If B ∈ l and 𝔄 ω l, then B𝔄 # l.

PROOF: B𝔄 = p; choose q in 𝔄 so that q # p.
By Lemma 11.1, B ω q, so l # q.



Put 𝔓(l, q) = ℭ. By Th. 10, either 𝔄 is proper (𝔄 = A) or ℭ is proper
(ℭ = C).

If 𝔄 = A, then A ω l, so AB # l.

If ℭ = C, then C ω p (Lemma 11.1), so l # p.

PROOF OF THEOREM 11: I. Suppose 𝔄 ω l. Choose B on l; B𝔄 = p.
By Lemma 11.2, p # l. Choose r in 𝔄; r # l or r # p.

If r # p, then B ω r (Lemma 11.1), so l # r.

We have proved that for every line r in 𝔄, l # r is valid, so l ω 𝔄.

II. Suppose l ω 𝔄, and let 𝔅 be any p. point of l; then l ∈ 𝔅, so
𝔄 # 𝔅. Thus 𝔄 ω l.

THEOREM 12:  If l is not apart from 𝔄, then l ∈ 𝔄.
PROOF: This has been shown in the proof of Th. 7(ii).

DEFINITION 11.  Two p. lines λ and μ are APART from each other (λ # μ)
if there exists a p. point 𝔄 so that 𝔄 ∈ λ and 𝔄 ω μ.

We shall not prove directly that the relation # between p. lines is an
apartness relation; this follows from the main result of the paper, as it
has been derived in [1, 2] from the axioms S1-S3, P1-P5.

8. Proof of the projective axioms. Our problem is to prove that
projective points and projective lines satisfy the axioms P1-P5 of projective
geometry. I have not succeeded in proving this from A1-A7; I must
introduce some further axioms, which I shall mention where I want them.

P1.  If the p. points 𝔄 and 𝔅 are apart from each other, then there is
a p. line λ which contains 𝔄 and 𝔅.

This is an immediate consequence of Def. 9.

P2.  If the p. points 𝔄 and 𝔅 are apart from each other, and if both
are contained in the p. lines λ and μ, then λ = μ.
This will be proved in Theorem 23.

P3.  If the p. lines λ and μ are apart from each other, then they have a
p. point in common. This property, for the case that λ or μ is proper, is
affirmed in a new axiom A8.

A8.     𝔄 # 𝔅 & l ω 𝔄 → (∃ℭ)(l ∈ ℭ & ℭ ∈ λ(𝔄, 𝔅)).

The quantifier over projective points can be avoided. An equivalent
formulation is:

A'8:     p#q&r#s&lco  <$(p,  q)  &  r  co  %(p,  q) 
#  /  &  -  %(p,  g)  o  %(r,  s)  = 


In the case that λ(𝔄, 𝔅) is a proper p. line, A8 follows from A1-A7 and
Def. 7. A8 suffices to prove P3, because it follows from Th. 17 below,
that if the p. lines λ and μ are apart from each other, then λ or μ is a
proper p. line.

We now turn to P4, the triangle axiom. First we consider the cases
where two of the p. points 𝔄, 𝔅, ℭ are proper (Theorems 13, 14, 15).

THEOREM 13.     If A # B and ℭ ω AB, then A ω Bℭ.
PROOF: AB ω ℭ [Th. 11], so AB # Bℭ, so A ω Bℭ.

THEOREM 14.     If A # 𝔅 and C ω A𝔅, then A ω C𝔅.

PROOF: A𝔅 = l, AC = p, C𝔅 = n.

By Lemma 7.1, n intersects l or n # p.

If n intersects l, then we can directly apply A4.

If n # p, then A ω n.

THEOREM 15.     If A # 𝔅 and C ω A𝔅, then 𝔅 ω AC.

PROOF:     AC = l, A𝔅 = m, C𝔅 = n; m # n.

Let p be any line in 𝔅. p # m or p # n.

If p # m, then A ω p (Lemma 11.1), so p # l.

If p # n, then C ω p (Lemma 11.1), so p # l.

We have now proved that l ω 𝔅, so 𝔅 ω l (Theorem 11).

Let now at least one of the p. points be proper.

THEOREM 16.     If A # 𝔅 and ℭ ω A𝔅, then 𝔅 ω Aℭ.

PROOF:     A𝔅 = l, Aℭ = p.

ℭ ω l, so l ω ℭ [Th. 11], so l # p.

Let m be any line in 𝔅, m # l or m # p.

If m # l we choose Q on m, so that Q ω l, AQ = q.

m intersects l ∨ m # q (Lemma 7.1).

If m intersects l, then 𝔅 is proper, 𝔅 = B, and B ω p by Th. 10, so
m # p.

If m # q, then A ω m, so m # p.

We have proved that m # p for every line m in 𝔅, so p ω 𝔅, so 𝔅 ω p
(Th. 11).



Note that Axiom A8 has not been used in the proofs of Theorems 13,
14, 15, 16.

There are two other cases of P4 in which only one of the three p. points
is proper. I have not succeeded in proving these from the preceding
axioms; therefore I introduce them as new axioms:

A9      If A # 𝔅 and ℭ ω A𝔅, then A ω λ(𝔅, ℭ).
A10     If 𝔅 # ℭ and A ω λ(𝔅, ℭ), then 𝔅 ω Aℭ.

It follows from the next theorem that the case in which none of the
three p. points is known to be proper, need not be considered.

THEOREM 17.  If 𝔄 # 𝔅 and ℭ ω λ(𝔄, 𝔅), then at least one of the p.
points 𝔄, 𝔅, ℭ is proper.

PROOF:     Choose l so that l ∈ ℭ, l ω 𝔄.

By A8, there is a p. point 𝔇 so that l ∈ 𝔇, 𝔇 ∈ λ(𝔄, 𝔅). 𝔇 # ℭ; by Th. 10
ℭ is proper or 𝔇 is proper. If 𝔇 is proper, 𝔇 = D, then λ(𝔄, 𝔅) = D𝔄
is proper, so, again by Th. 10, 𝔄 is proper or 𝔅 is proper.

It remains to prove the uniqueness of the p. line through two p. points
which are apart from each other. We first prove some theorems about
improper p. points.

THEOREM 18:  If 𝔄 and 𝔅 are improper p. points and 𝔄 # 𝔅, then
any line in 𝔄 intersects any line in 𝔅.

PROOF: Let l be a line in 𝔄 and m a line in 𝔅. Choose p in 𝔄 so that
p ω 𝔅, and choose C on p. C𝔅 = q; then q # p. By Th. 16, 𝔄 ω q, so
q ω 𝔄, so l # q. p intersects q, so that we infer from A3 that l has a point
in common with p or with q.

If l has a point in common with p, then l cannot be apart from p,
because 𝔄 is improper, so l = p, so l intersects q. It has now been proved
that in every case l intersects q. Repeating this argument for 𝔅 instead of
𝔄, we find that m intersects l.

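THEOREM 19: If $\mathfrak{A} \# \mathfrak{B}$, $\mathfrak{C} \# \mathfrak{A}$ and $\mathfrak{C} \cap \mathfrak{B} = \mathfrak{A} \cap \mathfrak{B}$, then
$\mathfrak{C} \cap \mathfrak{A} = \mathfrak{A} \cap \mathfrak{B}$.

PROOF: It is clear that $\mathfrak{A} \cap \mathfrak{B} \subseteq \mathfrak{A} \cap \mathfrak{C}$.
Let $l$ be a line in $\mathfrak{A} \cap \mathfrak{C}$; we must prove that $l \in \mathfrak{B}$.

Case 1: $\mathfrak{A}$ or $\mathfrak{B}$ is a proper p. point. Then $\lambda(\mathfrak{A}, \mathfrak{B})$ is a proper p. line $m$.
$m \in \mathfrak{A} \cap \mathfrak{B}$, so $m \in \mathfrak{A} \cap \mathfrak{C}$. By Th. 8, $l = m$, so $l \in \mathfrak{B}$.

Case 2: (General case). Because $l \in \mathfrak{A} \cap \mathfrak{C}$, $\mathfrak{A}$ or $\mathfrak{C}$ is proper, so that
we may now assume that $\mathfrak{C}$ is proper. Choose $D$ on $l$ so that $D \# \mathfrak{B}$.
$D\mathfrak{B} = n$. We shall give an indirect proof for $n = l$.

Suppose $n \# l$; then it is impossible that $\mathfrak{A}$ or $\mathfrak{B}$ is a proper p. point,
for in case 1 we know that $l \in \mathfrak{B}$, so $l = D\mathfrak{B} = n$. Thus $\mathfrak{A}$ and $\mathfrak{B}$ are both
improper. It follows from Th. 10 that $\mathfrak{A} \cap \mathfrak{B} = \emptyset$, so $\mathfrak{B} \cap \mathfrak{C} = \emptyset$, so $\mathfrak{C}$ is
improper (A1). But $\mathfrak{C}$ is proper; this contradiction proves that $n \# l$ is
impossible, so $n = l$, and $l \in \mathfrak{B}$.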

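COROLLARY. In the case that $\mathfrak{C} \# \mathfrak{A}$ we have

$\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B}) \leftrightarrow \mathfrak{A} \cap \mathfrak{C} = \mathfrak{A} \cap \mathfrak{B}$.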


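THEOREM 20. If $\mathfrak{A}$ and $\mathfrak{B}$ are improper p. points, and $\mathfrak{A} \# \mathfrak{B}$, then
$\lambda(\mathfrak{A}, \mathfrak{B})$ is the set of all improper p. points.

PROOF: By Th. 10, $\mathfrak{A} \cap \mathfrak{B} = \emptyset$. Let $\mathfrak{P}$ be any p. point in $\lambda(\mathfrak{A}, \mathfrak{B})$; as
$\mathfrak{A} \# \mathfrak{P}$ or $\mathfrak{B} \# \mathfrak{P}$, we may assume that $\mathfrak{A} \# \mathfrak{P}$.

Then, by Th. 19, $\mathfrak{A} \cap \mathfrak{P} = \mathfrak{A} \cap \mathfrak{B} = \emptyset$, so $\mathfrak{P}$ is improper. Conversely,
if $\mathfrak{P}$ is improper, we have by Th. 10, that $\mathfrak{A} \cap \mathfrak{P} = \emptyset = \mathfrak{A} \cap \mathfrak{B}$, so
$\mathfrak{P} \in \lambda(\mathfrak{A}, \mathfrak{B})$.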

The  following  theorem  asserts  the  uniqueness  of  the  p.  point  of  which 
the  existence  is  affirmed  in  A8. 


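THEOREM 21. If $\mathfrak{A} \# \mathfrak{B}$, $\mathfrak{C}$ and $\mathfrak{D}$ belong to $\lambda(\mathfrak{A}, \mathfrak{B})$, $p \mathrel{\omega} \mathfrak{A}$, $p$
belongs to $\mathfrak{C}$ and to $\mathfrak{D}$, then $\mathfrak{C} = \mathfrak{D}$.

PROOF: Suppose $\mathfrak{C} \# \mathfrak{D}$. As $\mathfrak{C} \# \mathfrak{A}$ and $\mathfrak{D} \# \mathfrak{A}$, it follows from Th.
19 that $\mathfrak{A} \cap \mathfrak{B} = \mathfrak{A} \cap \mathfrak{C} = \mathfrak{A} \cap \mathfrak{D}$. Thus, if $\mathfrak{A} \cap \mathfrak{B}$ contained a line $l$, we
should have $l \in \mathfrak{C} \cap \mathfrak{D}$, so, by Th. 8, $l = p$; but this contradicts $p \mathrel{\omega} \mathfrak{A}$.
It follows that $\mathfrak{A} \cap \mathfrak{B} = \emptyset$.

Thus $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$, $\mathfrak{D}$ are all improper p. points (Th. 20), and, as $\mathfrak{C} \# \mathfrak{D}$,
$\mathfrak{C} \cap \mathfrak{D} = \emptyset$, contradicting $p \in \mathfrak{C} \cap \mathfrak{D}$. We have proved that $\mathfrak{C} \# \mathfrak{D}$ is
impossible, so $\mathfrak{C} = \mathfrak{D}$.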

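THEOREM 22. If $\mathfrak{A} \# \mathfrak{B}$ and if it is impossible that $\mathfrak{C} \mathrel{\omega} \lambda(\mathfrak{A}, \mathfrak{B})$, then
$\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$.

PROOF: We treat the case that $\mathfrak{C} \# \mathfrak{A}$. Choose $p$ in $\mathfrak{C}$ so that $p \mathrel{\omega} \mathfrak{A}$.
By A8, there is a p. point $\mathfrak{G}$ so that $p \in \mathfrak{G}$, $\mathfrak{G} \in \lambda(\mathfrak{A}, \mathfrak{B})$. We give an
indirect proof for $\mathfrak{C} = \mathfrak{G}$. Suppose that $\mathfrak{C} \# \mathfrak{G}$. As $p \in \mathfrak{C} \cap \mathfrak{G}$, $\mathfrak{C}$ or $\mathfrak{G}$ is
a proper p. point (Th. 10). If $\mathfrak{G}$ is proper, $\mathfrak{G} = G$, then $\lambda(\mathfrak{A}, \mathfrak{B})$ is proper,
$\lambda(\mathfrak{A}, \mathfrak{B}) = l$; $G\mathfrak{A} = l$, $G\mathfrak{C} = p$. $\mathfrak{A} \mathrel{\omega} G\mathfrak{C}$, so $\mathfrak{C} \mathrel{\omega} G\mathfrak{A}$, $\mathfrak{C} \mathrel{\omega} \lambda(\mathfrak{A}, \mathfrak{B})$, which
is impossible by hypothesis. If $\mathfrak{C}$ is proper, $\mathfrak{C} = C$, then $\lambda(\mathfrak{A}, \mathfrak{C})$ is proper,
$\lambda(\mathfrak{A}, \mathfrak{C}) = m$. Suppose that $\mathfrak{B} \mathrel{\omega} m$; then we should have $C \mathrel{\omega} \lambda(\mathfrak{A}, \mathfrak{B})$
(A9), which is impossible by hypothesis. But if $\mathfrak{B} \in m$, then $\lambda(\mathfrak{A}, \mathfrak{B})
= m$, $\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$, $\mathfrak{C} = \mathfrak{G}$. We have derived a contradiction from the
hypothesis that $\mathfrak{C} \# \mathfrak{G}$, so $\mathfrak{C} = \mathfrak{G}$, and $\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$.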

Remark: A9 is only used in this proof, A10 nowhere in this paper.
However, A10 will be needed for the derivation of further theorems of
projective geometry.

The  next  theorem  asserts  P2  for  p.  points. 

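THEOREM 23. If $\mathfrak{A} \# \mathfrak{B}$, $\mathfrak{A} \# \mathfrak{C}$ and $\mathfrak{C} \in \lambda(\mathfrak{A}, \mathfrak{B})$, then
$\lambda(\mathfrak{A}, \mathfrak{B}) = \lambda(\mathfrak{A}, \mathfrak{C})$.

PROOF: We first prove the theorem, making the extra assumption
that $\lambda(\mathfrak{A}, \mathfrak{B})$ is a proper p. line $l$. In this case, $l \in \mathfrak{A} \cap \mathfrak{B}$, so $\lambda(\mathfrak{A}, \mathfrak{B}) =
\{\mathfrak{P} \mid l \in \mathfrak{P}\}$. Moreover, $l \in \mathfrak{A} \cap \mathfrak{C}$, so $\lambda(\mathfrak{A}, \mathfrak{C}) = \{\mathfrak{P} \mid l \in \mathfrak{P}\}$. Thus $\lambda(\mathfrak{A}, \mathfrak{B}) =
\lambda(\mathfrak{A}, \mathfrak{C})$.

In the general case, let $\mathfrak{D}$ be any p. point in $\lambda(\mathfrak{A}, \mathfrak{B})$. Suppose that
$\mathfrak{D} \mathrel{\omega} \lambda(\mathfrak{A}, \mathfrak{C})$; then, by Th. 17, at least one of the p. points $\mathfrak{A}$, $\mathfrak{C}$, $\mathfrak{D}$ is
proper, so that $\lambda(\mathfrak{A}, \mathfrak{B})$ is proper, but in this case we have already proved
that $\lambda(\mathfrak{A}, \mathfrak{B}) = \lambda(\mathfrak{A}, \mathfrak{C})$. Thus the assumption that $\mathfrak{D} \mathrel{\omega} \lambda(\mathfrak{A}, \mathfrak{C})$ has led to
a contradiction; by Th. 22, $\mathfrak{D} \in \lambda(\mathfrak{A}, \mathfrak{C})$. We have now proved that
$\lambda(\mathfrak{A}, \mathfrak{B}) \subseteq \lambda(\mathfrak{A}, \mathfrak{C})$. In particular, $\mathfrak{B} \in \lambda(\mathfrak{A}, \mathfrak{C})$. Now we prove by the same
argument, interchanging $\mathfrak{B}$ and $\mathfrak{C}$, that $\lambda(\mathfrak{A}, \mathfrak{C}) \subseteq \lambda(\mathfrak{A}, \mathfrak{B})$.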


Bibliography

[1] HEYTING, A., Intuitionistische axiomatiek der projectieve meetkunde. Thesis,
    University of Amsterdam. Groningen 1925.
[2] HEYTING, A., Zur intuitionistischen Axiomatik der projektiven Geometrie.
    Mathematische Annalen, vol. 98 (1927), pp. 491-538.
[3] HEYTING, A., Die formalen Regeln der intuitionistischen Logik. Sitzungsberichte
    preuss. Akad. Wiss. Berlin (1930), pp. 42-56.
[4] HEYTING, A., Die formalen Regeln der intuitionistischen Mathematik.
    Sitzungsberichte preuss. Akad. Wiss. Berlin (1930), pp. 55-71.
[5] HEYTING, A., Intuitionism, an introduction. Amsterdam 1956.

Symposium  on  the  Axiomatic  Method 


FOUNDATIONS OF GEOMETRY FROM THE STANDPOINT
OF GENERAL TOPOLOGY

KAROL BORSUK

University of Warsaw, Warsaw, Poland

The problem of founding geometry axiomatically, first posed and solved
by Euclid, has not lost its relevance to this day. At the end of the 19th
century Hilbert [9] gave his famous axiomatization of geometry, by which
he substantially deepened and completed Euclid's ideas and gave geometry
the form of a deductive theory in the modern sense. Hilbert's
axiomatization also definitively clarified the relationship between
Euclidean geometry and the hyperbolic geometry of Lobachevsky and Bolyai.

This, however, by no means exhausted the problem of the foundations of
geometry. On the one hand, the development of mathematical logic opened
up wide possibilities for investigating the logical structure of geometry
as a deductive theory. A significant part of modern work on the
foundations of geometry goes in this direction. On the other hand, the
introduction of various types of general spaces gave rise to the problem
of clarifying the position of the classical spaces among all abstract
spaces.

The problem of characterizing the classical geometries among all
Riemannian geometries was already posed in the 19th century by Sophus
Lie [15]. Lie, taking up earlier ideas of Helmholtz [8], gave this problem
the form of a characterization of the group of rigid motions. But only
the rise of general set theory, and of the axiomatic theory of general
abstract spaces based on it, gave the problem of the foundations of
geometry a truly general, modern form. From this general standpoint,
Lie's problem of founding geometry by means of the group of rigid motions
was attacked in 1930 by Kolmogoroff [11]. Kolmogoroff gave a system of
axioms characterizing the class of Riemannian spaces of constant
curvature. Unfortunately, Kolmogoroff did not publish complete proofs of
his assertions, and only in recent years have two papers by Tits [20],
[21] and then a paper by Freudenthal [7] appeared, in which a solution of
the space problem, definitive in a certain sense, is given on the basis
of a characterization of the group of motions. Freudenthal's paper also
gives a far-reaching classification of the geometries characterized in
this way.

A somewhat different direction is taken by the papers that aim at a
characterization of the classical spaces on the basis of the general
metric geometry developed by Menger [17]. Besides the essential results
of Menger [17], [18], one should mention here the results of Wilson [26],
Garrett Birkhoff [2], Blumenthal [3], H. C. Wang [25], and others. Wilson
characterized the Euclidean metrics by means of a certain metric property
of every four points of the space. Garrett Birkhoff and the others base
their investigations on a postulate of metric homogeneity.

In this lecture I would like to deal with a characterization of the
classical spaces on the basis of a classification of general topological
spaces. The main concern is to formulate the problem and to indicate the
difficulties that arise. I am not in a position to give a definitive
solution of the problem, and I even think that we are still far from one.
I would only like to give a partial solution, which may serve to
illustrate the general tendency of these considerations.

To build the foundations of geometry on a broad basis of topological
properties, one needs to work out a more fully developed systematics of
topological spaces. So far such a systematics is little developed. The
situation is best for the most general types of topological spaces.
Various axioms (of regularity, of normality, of a base, and so on) allow
one to define certain classes of spaces with more or less rich geometric
content. But the classification of the more special spaces is so far
highly deficient, and we are still far from a topological determination
of the important class of so-called polyhedra. By polyhedra I mean here
those separable spaces for which a locally finite triangulation exists.
I believe that only a development of the systematics of topological
spaces will create an adequate basis for a precise clarification of the
nature of the classical spaces. But since geometry needs metric axioms as
well as topological ones, it seems to me that an axiomatization useful
for the purposes mentioned above should contain metric axioms alongside
topological ones, and should do so with regard to the following
"principle of topological-metric parallelism":

PRINCIPLE $[T \parallel M]$. The topological axioms should imply the existence
of a metric satisfying the metric axioms. The metric axioms should imply
that the topological axioms are satisfied.

The well-known difficulties in attempts to characterize the Euclidean
spaces topologically mean that a complete realization of the principle
$[T \parallel M]$ is, at the present level of topology, rather hopeless. One
can, however, give a sketch of an axiomatization of Euclidean geometry
that takes this principle into account at least in part.

It is convenient to divide our axioms into the following three groups:

I. Axioms of general spaces.
II. Axioms of regularity.
III. Special axioms.

As the axioms of the first group one can choose any axioms that
characterize the metrizable separable spaces. One can take, for example,
the three axioms of closure (of Kuratowski [13], [14]), the axiom of
normality, and the axiom of a countable base. If, as usual, we denote
the closure of a set $X$ by $\overline{X}$, the first group of topological axioms
takes the following form (see [14]):

AXIOMS (I, T).

(1) $\overline{X \cup Y} = \overline{X} \cup \overline{Y}$.

(2) If $X$ is empty or consists of a single point, then $\overline{X} = X$.

(3) $\overline{\overline{X}} = \overline{X}$.

(4) For any two disjoint closed sets $X$ and $Y$ there is an open set $G$
such that $X \subset G$ and $\overline{G} \cap Y = \emptyset$ (axiom of NORMALITY).

(5) There is a sequence $\{G_n\}$ of open sets such that every open set is
the union of certain sets of this sequence (axiom of a COUNTABLE BASE).
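The three closure axioms (1)-(3) can be tested mechanically on a finite set. The following is a minimal sketch added for illustration and is not part of the lecture; the names `subsets` and `satisfies_closure_axioms` are hypothetical. On a finite set, axiom (2) forces every singleton to be closed, so only the discrete closure operator passes, which is consistent with the remark that the interesting examples are infinite.

```python
from itertools import combinations

def subsets(points):
    """All subsets of a finite collection of points, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def satisfies_closure_axioms(points, closure):
    """Check closure axioms (1)-(3) of group (I, T) for a closure
    operator given as a function frozenset -> frozenset."""
    all_subsets = subsets(points)
    for x in all_subsets:
        cx = closure(x)
        # Axiom (2): the empty set and single points are closed.
        if len(x) <= 1 and cx != x:
            return False
        # Axiom (3): closing twice gives nothing new.
        if closure(cx) != cx:
            return False
        # Axiom (1): closure commutes with finite unions.
        for y in all_subsets:
            if closure(x | y) != cx | closure(y):
                return False
    return True

points = frozenset({0, 1, 2})
discrete = lambda s: s                                   # every set is closed
indiscrete = lambda s: points if s else frozenset()      # only 0 and X are closed
print(satisfies_closure_axioms(points, discrete))
print(satisfies_closure_axioms(points, indiscrete))
```

The indiscrete operator satisfies (1) and (3) but fails (2), since the closure of a single point is the whole set.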



As the corresponding group of metric axioms we take the three axioms of
metric spaces of Fréchet (with a modification due to Lindenbaum [16])
and the axiom of separability. If, as usual, we denote the distance from
the point $x$ to the point $y$ by $\rho(x, y)$, the first group of metric
axioms takes the following form:

AXIOMS (I, M).

(1) $\rho(x, y)$ is real.

(2) $\rho(x, y) = 0$ if and only if $x = y$.

(3) $\rho(x, y) + \rho(x, z) \geq \rho(y, z)$.

(4) There is an at most countable subset that is dense in the space
(axiom of SEPARABILITY).
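The following short derivation is added for illustration and is not part of the lecture. Reading axiom (3) as $\rho(x, y) + \rho(x, z) \geq \rho(y, z)$ for all points $x$, $y$, $z$, it sketches why this single inequality, together with (1) and (2), already yields the usual properties of a metric (non-negativity, symmetry, triangle inequality), which is presumably the point of the modification attributed to Lindenbaum.

```latex
% Consequences of axioms (I, M), (1)-(3); a worked sketch, not from the source.
\begin{align*}
&\text{(a) put } z = y \text{ in (3):}
  && 2\rho(x,y) \ge \rho(y,y) = 0, \text{ hence } \rho(x,y) \ge 0. \\
&\text{(b) put } z = x \text{ in (3):}
  && \rho(x,y) + \rho(x,x) \ge \rho(y,x), \text{ hence } \rho(x,y) \ge \rho(y,x); \\
& && \text{exchanging } x \text{ and } y \text{ gives } \rho(x,y) = \rho(y,x). \\
&\text{(c) by (b) and (3):}
  && \rho(y,x) + \rho(x,z) \ge \rho(y,z).
\end{align*}
```

Step (c) is the triangle inequality in its familiar form, so no separate symmetry axiom is needed.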

The well-known metrization theorem of Urysohn [23] states that the
axioms (I, T) imply the existence of a metric satisfying the axioms
(I, M). Conversely, the existence of a metric satisfying the axioms
(I, M) entails that all axioms of the group (I, T) are satisfied. Thus
for these axioms the principle of topological-metric parallelism is
fulfilled.

The spaces satisfying the first group of axioms form an important and
well-known class. But this class is so general that it also contains
many spaces with rather complicated and hardly intuitive properties. The
second group of axioms is meant to single out, among the general metric
spaces, a class of spaces with particularly simple, intuitive properties.
These axioms should thus eliminate various so-called pathological
phenomena [5]. Together with the axioms of the first group, they should
form a topological basis for every "reasonable" geometry. In contrast to
the precisely determined first group of axioms, the formulation of the
axioms of the second group is far from settled. It seems that these
axioms should concern above all the local properties of the space and
should define a class of spaces wide enough to contain all polyhedra,
yet special enough to exclude all spaces with paradoxical properties. I
shall set up this group of axioms here only provisionally, as follows:

AXIOMS (II, T).

(1) Local compactness.

(2) Local connectedness.



The group of axioms so formulated is certainly not sufficient to
characterize the "reasonable spaces". For our modest purpose of giving a
very imperfect prototype of a topological-metric axiomatization,
however, it will suffice.

The corresponding group of metric axioms consists of two axioms:

AXIOMS (II, M).

(1) Compactness of bounded closed sets.

(2) Local convexity.

Here a space $X$ is said to be locally convex if for every point $a \in X$
there is a neighborhood $U$ such that for any two points $x, y \in U$ there
is at least one point $z \in X$ for which

$\rho(x, z) = \rho(y, z) = \tfrac{1}{2} \cdot \rho(x, y)$

holds. Every such point $z$ is called a midpoint of the pair $x$, $y$.
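Two standard examples, added here for illustration and not taken from the lecture, may help fix the definition of a midpoint. The first shows that Euclidean space with its usual distance has midpoints; the second shows that the same set-theoretic circle can fail or satisfy the midpoint condition depending on the metric chosen, which is why local convexity is a metric axiom rather than a topological one.

```latex
% Worked examples for the midpoint condition; a sketch, not from the source.
% Euclidean space with rho(x,y) = |x - y|: the point z = (x+y)/2 is a midpoint.
\[
\rho(x,z) = \Bigl|\,x - \tfrac{x+y}{2}\Bigr| = \tfrac12\,|x-y| = \tfrac12\,\rho(x,y),
\qquad
\rho(y,z) = \Bigl|\,y - \tfrac{x+y}{2}\Bigr| = \tfrac12\,\rho(x,y).
\]
% Unit circle S^1 in the plane with the chord metric rho(x,y) = |x - y|:
% for x \neq y the only point z of the plane with |x-z| = |y-z| = |x-y|/2
% is z = (x+y)/2, and
\[
\Bigl|\tfrac{x+y}{2}\Bigr| < 1 \quad\text{whenever } |x| = |y| = 1,\ x \neq y,
\]
% so z does not lie on S^1 and no pair of distinct points has a midpoint.
% With the arc-length metric, by contrast, nearby points of S^1 do have midpoints.
```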

It is known (cf. Menger) that in a metric space $X$ in which the axioms
(II, M) are satisfied there is, for every point $a \in X$, a positive number
$r$ such that every point $x \in X - (a)$ whose distance from $a$ is less
than $r$ can be joined to $a$ by a straight segment. From this it follows
at once that the axioms (II, M) entail the axioms (II, T). Whether,
conversely, the axioms (II, T) (together with the axioms of the first
group) entail the existence of a metric satisfying the axioms (II, M)
has not yet been definitively settled. The results of Bing [1], who
proved Menger's conjecture on the convex metrizability of locally
connected continua, and also the recently obtained results of the
Japanese mathematicians Tanaka and Tominaga [19], [22], allow one to
conjecture, however, that the principle $[T \parallel M]$ is fulfilled for
these axioms. But since the second group of axioms certainly cannot be
regarded as definitively formulated, I think that the essential
difficulties will appear only when this group of axioms is suitably
completed.

The axioms of the third group should state the specific properties of
the space in question. If we want to realize the principle of
topological-metric parallelism, the axioms (III, T), together with the
axioms (I, T) and (II, T), should characterize the space under
consideration completely in topological terms. A topological
characterization of the Euclidean plane was given many years ago by
van Kampen [10]. Thus in this special case setting up an axiomatization
that satisfies the principle $[T \parallel M]$ presents no difficulties. Finding
a similar axiomatization for higher-dimensional Euclidean spaces is an
incomparably harder task. It turns out to be possible, however, to state
the axiom groups (III, T) and (III, M) in such a way that, taken together
and with the axioms of the first and second groups, they form a complete
axiomatization of elementary geometry. The principle $[T \parallel M]$ is
neglected here, because the separation of topological and metric axioms
that it demands is not achieved. The difficulty of realizing such a
separation lies mainly on the topological side. Thus one comes the
closer to realizing the principle $[T \parallel M]$, the richer the content of
the topological axioms (III, T) becomes. In other words, it is desirable
to reduce the role of the metric axioms (III, M) as far as possible.

To formulate the axioms of the group (III, T) for Euclidean spaces, I
shall use the following notion:

A true cycle (in the sense of Vietoris [24]) $\gamma$ is linked in the space $X$
with the point $x \in X$ if $\gamma$ has a compact carrier
$A \subset X - (x)$ and $\gamma$ is not homologous to zero in the set $X - (x)$.

The axiom group (III, T) consists of the following four axioms:

AXIOMS (III, T).

(1) The space is connected.

(2) The dimension of the space is equal to $n$.

(3) In every neighborhood of every point of the space there is a true
cycle that is linked in the space with this point.

(4) For every true cycle of the space, the set of points with which this
cycle is linked is open.

Evidently the axioms of the groups (I, T), (II, T), (III, T) are not
sufficient to characterize the $n$-dimensional Euclidean space
topologically. They are satisfied, for example, by every open
$n$-dimensional manifold and also by various other spaces. They can be
strengthened in various ways. But since, as I have already said, we
cannot give a purely topological characterization of the $n$-dimensional
Euclidean space, we are forced to neglect the principle $[T \parallel M]$ and to
introduce certain metric conditions. To characterize the $n$-dimensional
Euclidean space completely, it suffices to add to the axioms named above
the following metric axiom (III, M):

AXIOM (III, M).

Every four points of the space are congruent to some four points of the
3-dimensional Euclidean space $E_3$.

This axiom, formulated by Menger [17] and used by Wilson [26] and others,
is, as usual, called the four-point condition.
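The four-point condition can be checked numerically for a given quadruple of points. The sketch below is an illustration added here, not part of the lecture: it uses the classical Gram-matrix criterion (four points with prescribed mutual distances can be placed in $E_3$ exactly when the $3 \times 3$ Gram matrix built from the distances is positive semidefinite), and the function names are hypothetical. The input is assumed to be a symmetric $4 \times 4$ distance table with zero diagonal.

```python
from itertools import combinations

def gram_from_distances(d):
    """Gram matrix of the vectors p_i - p_0 (i = 1, 2, 3), computed from
    the mutual distances d[i][j] alone via the polarization identity."""
    return [[(d[0][i] ** 2 + d[0][j] ** 2 - d[i][j] ** 2) / 2.0
             for j in (1, 2, 3)] for i in (1, 2, 3)]

def det(m):
    """Determinant by Laplace expansion along the first row (size <= 3)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for c in range(n):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total

def satisfies_four_point_condition(d, tol=1e-9):
    """True if the four points are congruent to four points of E_3.
    A symmetric matrix is positive semidefinite exactly when all of its
    principal minors are non-negative; tol absorbs rounding errors."""
    g = gram_from_distances(d)
    for size in (1, 2, 3):
        for idx in combinations(range(3), size):
            sub = [[g[i][j] for j in idx] for i in idx]
            if det(sub) < -tol:
                return False
    return True

# Vertices of a regular tetrahedron: all mutual distances equal to 1.
tetrahedron = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
# Four points on a closed path of length 4 with the shortest-path metric.
four_cycle = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
print(satisfies_four_point_condition(tetrahedron))
print(satisfies_four_point_condition(four_cycle))
```

The second quadruple is a metric space in which every triple embeds in the plane, yet the quadruple as a whole does not embed in any Euclidean space: two of the points would both have to be the midpoint of the same segment while lying at distance 2 from each other.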

The axiomatization of the $n$-dimensional Euclidean space consisting of
the three groups of topological and metric axioms is in reality only a
modification of the purely metric axiomatization given by Wilson [26] in
1932. The purpose of this modification was to replace the metric axioms
by topological ones to the greatest possible extent, and so to come
closer to realizing the principle $[T \parallel M]$.

We shall now show that our axiomatization completely characterizes the
$n$-dimensional Euclidean space $E_n$.

From the axioms of the first and second groups it follows that for every
point $a \in X$ there is an open neighborhood $U$ such that every point
$b \in U$ can be joined to $a$ by a straight segment. We may assume that
this neighborhood is bounded. From the four-point condition it follows
that the segment joining $a$ and $b$ is unique. We shall denote it by
$\overline{ab}$. Since $U$ is bounded, the union of all these segments is bounded as
well. Applying the first of the axioms (II, M), we easily conclude that
the segment $\overline{ab}$ depends continuously on its endpoints $a$ and $b$.

To show now that every segment $\overline{ab}$ can be extended beyond the endpoint
$b$, consider a neighborhood $V$ of the point $b$ so small that it does not
contain the point $a$ and that any two of its points can be joined by a
uniquely determined segment depending continuously on its endpoints. By
the third of the axioms (III, T) there is in the neighborhood $V$ a true
cycle $\gamma$ that is linked with $b$ (cf. Fig. 1). Consider a compact carrier
$B \subset V - (b)$ of $\gamma$ and a point $c \in \overline{ab} \cap V$ different from $b$. One easily
sees that the true cycle $\gamma$ is homologous to zero in the union of all
segments $\overline{cx}$ with $x \in B$. Since $\gamma$ is not homologous to zero in the set
$X - (b)$, we conclude that there is a point $b' \in B$ such that the segment
$\overline{cb'}$ contains the point $b$. By means of the four-point condition we
conclude that $\overline{ac} \cup \overline{cb'}$ is the desired extension of the segment $\overline{ab}$.
From this it follows easily (by applying the first of the axioms
(II, M)) that there is a half-line that has $a$ as its endpoint and
contains the point $b$. Using the first of the axioms (II, M) and the
four-point condition one shows further that this half-line depends
continuously on the point $b$, in the sense that the point of this
half-line at distance $s > 0$ from $a$ depends continuously on $s$ and $b$.


Fig.  1 

We shall now show that for every point $x \in X$ there is a half-line $H$
that has $a$ as its endpoint and contains the point $x$. By the second of
the axioms (II, M) there is a number $r > 0$ such that such a half-line
exists for all points $x$ with $\rho(a, x) < r$. Let $Q$ denote the solid ball
about $a$ of radius $r$. We assume that

$\rho(a, x) > r > 0$

holds. Since the space $X$ is connected (axiom (III, T), 1), locally
connected (axiom (II, T), 2) and locally compact (axiom (II, T), 1),
there is in $X$ a simple arc $B$ joining $a$ and $x$. Let $s$ be a number
greater than the diameter of $B$. By the third of the axioms (III, T)
there is in $Q$ a true cycle $\gamma$ that is linked with $a$. Let $A$ denote the
compact carrier of $\gamma$ contained in $Q - (a)$ (cf. Fig. 2). Since $Q$ is
contractible in itself to the point $a$, this cycle is homologous to zero
in $Q$. Hence it is not linked with the point $x$.

(i) $\gamma$ is linked with $a$,
(ii) $\gamma$ is not linked with $x$.

Now assume that none of the half-lines with endpoint $a$ contains the
point $x$. For every vertex $p$ of the cycle $\gamma$ we denote by $\varphi(p)$ the
point of the half-line $ap$ at distance $s$ from $a$. Since the half-line
in question depends continuously on the point $p$, we conclude that $\varphi$
maps the true cycle $\gamma$ onto a true cycle $\gamma^*$. Moreover the true cycles
$\gamma$ and $\gamma^*$ are homologous in the union of all segments $\overline{p\varphi(p)}$ with
$p \in A$. But since this union contains neither of the points $a$ and $x$,
we conclude that

(iii) $\gamma \sim \gamma^*$ in $X - (a) - (x)$

holds. From this and from (i) and (ii) it follows that

(iv) $\gamma^*$ is linked with $a$,
(v) $\gamma^*$ is not linked with $x$.

Fig. 2

But the cycle $\gamma^*$ lies on the surface $S$ of the ball about the point
$a$ of radius $s$. Since $s$ is greater than the diameter $\delta(B)$ of $B$, it
follows that $B$ is disjoint from $S$. If we now take into account (iv)
and the fourth of the axioms (III, T), we see that the cycle $\gamma^*$ must be
linked with the point $x$, which contradicts condition (v).

We have thus shown that any two distinct points of our space lie on a
straight line. But, as W. A. Wilson has shown [26], a metric,
$n$-dimensional, separable space in which any two points lie on a
straight line and in which axiom (III, M) is satisfied is congruent to
the $n$-dimensional Euclidean space $E_n$. Thus we see that our axioms
completely characterize the space $E_n$.

The main defect of the axiomatization given here is that in the third
group of axioms the topological and the metric assumptions are not
separated. I have already mentioned that, in the present state of
topology, this separation may be regarded as hopeless.

In the case of the plane, however, the situation is different. In this
case we can use the topological characterization of the Euclidean plane
given by van Kampen [10]. I shall state this axiomatization in a
somewhat modified form, in order to avoid the special notions of a simple
arc and a simple closed curve used in van Kampen's original
axiomatization. In this modified form, the topological axiomatization of
the plane consists of the axioms (I, T), (II, T) and the following group
of special axioms:

AXIOMS $(III, T)_2$.

(1) The space is connected.

(2) The space is not compact.

(3) Every compact cut of the space is not acyclic.

(4) Every compact subset of the space that is not acyclic in dimension 1
is a cut.

To realize the principle $[T \parallel M]$, it now suffices to take the following
as metric axioms:

AXIOMS $(III, M)_2$.

(1) Any two points lie on a straight line.

(2) Four-point condition.

To conclude this lecture I would like to add some remarks of a general
nature. I have treated the problem of the foundations of geometry here
as a fragment of the general problem of classifying topological spaces.
The three groups of axioms given here should also be viewed from this
standpoint. I have already remarked that the second group, which I
called the group of regularity, was set up here only provisionally. It
is meant to single out, among all topological spaces, a class of spaces
with particularly regular properties. A full topological
characterization of the class of polyhedra presents very essential
difficulties, since it would bridge the gulf between the axiomatically
defined abstract topological spaces and the figures of elementary
geometry defined by construction. Only in the case of polyhedra of
dimension at most two has it recently been possible to overcome these
difficulties (Kosiński [12]). On the other hand, an axiomatic treatment
of a certain part of the topological properties of polyhedra is
certainly possible even in the general case (cf. [4]). The question
arises, however, how this "certain part" is to be defined. This question
is closely connected with the question of a reasonable classification of
topological invariants.

Since Felix Klein's Erlangen Program, various geometric properties have
been classified according to the classes of mappings under which these
properties are invariant. If one replaces the homeomorphisms by a more
general class $\mathfrak{K}$ of certain continuous mappings, one singles out among
all topological properties a narrower class of properties that are
invariant under the mappings belonging to $\mathfrak{K}$. We shall call these
properties $\mathfrak{K}$-invariants (cf. [4]). About the class $\mathfrak{K}$ we shall assume
only that the composition of two mappings belonging to it again belongs
to it.

We shall say that two spaces $X$ and $Y$ belong to the same $\mathfrak{K}$-type if
they have the same $\mathfrak{K}$-properties. It is easy to see that two spaces $X$
and $Y$ belong to the same $\mathfrak{K}$-type if and only if there exist two
$\mathfrak{K}$-mappings

$f\colon X \xrightarrow[\text{onto}]{} Y$ and $g\colon Y \xrightarrow[\text{onto}]{} X$.

Let us consider some examples:

1. Let $\mathfrak{K}$ denote the class of all continuous mappings. The
$\mathfrak{K}$-invariants then include, for example: compactness, separability,
connectedness and, for compacta, also local connectedness.

2. Much more interesting is the case where $\mathfrak{K}$ is the class of all
so-called r-mappings. Here an r-mapping means a continuous mapping

$f\colon X \xrightarrow[\text{onto}]{} Y$

for which there exists a continuous right inverse mapping, that is, a
mapping

$g\colon Y \xrightarrow[\text{into}]{} X$,

satisfying the condition $fg(y) = y$ for every point $y \in Y$. The class of
invariants of r-mappings is very rich, and so two spaces belonging to
the same r-type show a far-reaching similarity in their properties.

Similarly, one can consider the invariants of all open mappings, or of all
continuous mappings with finite preimage sets, or of all continuous
mappings with acyclic preimage sets, and so on. To each such class of
invariants there corresponds a division of all spaces into the
corresponding $\mathfrak{K}$-types.

Since a complete topological characterization of the class of polyhedra is
rather hopeless, it seems appropriate to consider certain
characterizations of polyhedra from the standpoint of various classes
$\mathfrak{K}$. From the standpoint of the r-invariants, the polyhedra can
be characterized as metrizable, separable spaces by the following
conditions:

1. Local compactness.

2. Local contractibility.

3. Finite dimension at every point.

Thus we may regard these three properties as the axioms of regularity from
the standpoint of the theory of r-invariants.

In a similar way, for the topological axioms of the third group one may
require, instead of a complete topological characterization, a relative
characterization, that is, a characterization in the sense of a certain
$\mathfrak{K}$-type. One may expect that a systematic classification of
the topological invariants will make it possible, in this way, to
determine clearly the position of the classical spaces among general
topological spaces.



Bibliography

[1]  BING, R. H., Partitioning a set. Bulletin of the American Mathematical
     Society, vol. 55 (1949), pp. 1101-1110.
[2]  BIRKHOFF, Garrett, Metric foundations of geometry I. Transactions of the
     American Mathematical Society, vol. 55 (1944), pp. 465-492.
[3]  BLUMENTHAL, L., Theory and applications of distance geometry. Oxford 1953,
     pp. 1-347.
[4]  BORSUK, K., On the topology of retracts. Annals of Mathematics, vol. 48
     (1947), pp. 1082-1094.
[5]  BORSUK, K., Sur l'élimination de phénomènes paradoxaux en topologie
     générale. Proceedings of the International Congress of Mathematicians,
     vol. I, Amsterdam 1954, pp. 1-12.
[6]  FRÉCHET, M., Les espaces abstraits. Paris 1928, pp. XI + 296.
[7]  FREUDENTHAL, H., Neuere Fassungen des Riemann-Helmholtz-Lieschen
     Raumproblems. Mathematische Zeitschrift, vol. 63 (1956), pp. 374-405.
[8]  HELMHOLTZ, H., Über die tatsächlichen Grundlagen der Geometrie.
     Wissenschaftliche Abhandlungen, vol. II (1883), pp. 610-617.
[9]  HILBERT, D., Grundlagen der Geometrie. Leipzig 1900.
[10] KAMPEN VAN, E. R., On some characterization of 2-dimensional manifolds.
     Duke Mathematical Journal, vol. 1 (1935), p. 74.
[11] KOLMOGOROFF, A., Zur topologisch-gruppentheoretischen Begründung der
     Geometrie. Nachrichten von der Gesellschaft der Wissenschaften zu
     Göttingen, Mathematisch-Physikalische Klasse (1930), pp. 208-210.
[12] KOSINSKI, A., A topological characterization of 2-polytopes. Bulletin de
     l'Académie Polonaise des Sciences, Cl. III, vol. II (1954), pp. 321-323.
[13] KURATOWSKI, C., Sur l'opération A de l'Analysis Situs. Fundamenta
     Mathematicae, vol. 3 (1922), pp. 182-199.
[14] KURATOWSKI, C., Topologie I. Monografie Matematyczne, Warszawa 1952,
     pp. XI + 450.
[15] LIE, S., Über die Grundlagen der Geometrie. Gesammelte Abhandlungen II
     (1922), pp. 380-468.
[16] LINDENBAUM, A., Contribution à l'étude de l'espace métrique I. Fundamenta
     Mathematicae, vol. 8 (1926), pp. 209-222.
[17] MENGER, K., Untersuchungen über allgemeine Metrik. Mathematische Annalen,
     vol. 100 (1928), pp. 75-163.
[18] MENGER, K., Géométrie générale. Mémorial des Sciences Mathématiques,
     vol. 124, Paris 1954, pp. 1-80.
[19] TANAKA, Tadashi and TOMINAGA, Akira, Convexification of locally connected
     generalized continua. Journal of Science of the Hiroshima University,
     vol. 19 (1955), pp. 301-306.
[20] TITS, J., Étude de certains espaces métriques. Bulletin de la Société
     Mathématique de Belgique (1953), pp. 44-52.
[21] TITS, J., Sur un article précédent: Études de certains espaces métriques.
     Bulletin de la Société Mathématique de Belgique (1953), pp. 124-125.
[22] TOMINAGA, Akira, On some properties of non-compact Peano spaces. Journal
     of Science of the Hiroshima University, vol. 19 (1956), pp. 457-467.
[23] URYSOHN, P., Zum Metrisationsproblem. Mathematische Annalen, vol. 94
     (1925), pp. 309-315.
[24] VIETORIS, L., Über den höheren Zusammenhang kompakter Räume und eine
     Klasse von zusammenhangstreuen Abbildungen. Mathematische Annalen,
     vol. 97 (1927), pp. 454-472.
[25] WANG, H. C., Two-point homogeneous spaces. Annals of Mathematics, vol. 55
     (1952), pp. 177-191.
[26] WILSON, W. A., A relation between metric and euclidean spaces. American
     Journal of Mathematics, vol. 54 (1932), pp. 505-517.


Symposium  on  the  Axiomatic  Method 


LATTICE-THEORETIC  APPROACH  TO  PROJECTIVE 
AND  AFFINE  GEOMETRY 

BJARNI  JONSSON 

University  of  Minnesota,  Minneapolis,  Minnesota,    U.S.A. 

The results that we are going to discuss are due to several authors. The
earliest work along these lines was done by Menger in the late twenties.
He was joined a few years later by von Neumann and Birkhoff. A large
number of more recent contributions can be found in the papers listed in
the bibliography; we shall in particular make use of results due to Frink
and Schützenberger on projective geometry, and to Croisot, Maeda, Sasaki
and Wilcox on affine geometry and its generalizations. The bibliography
includes a number of papers that are not concerned directly with geometry,
but in which at least some of the ideas and methods were suggested by the
investigations of geometric lattices.

1. Concepts from lattice theory. A lattice can be defined as a partially
ordered set in which any two elements have a least upper bound and a
greatest lower bound. We shall use $\le$ for the partially ordering
relation and write $x + y$ and $xy$ for the least upper bound, or sum, and
the greatest lower bound, or product, of two elements $x$ and $y$. Most of
our lattices will be complete, i.e., any system of elements $x_i$,
$i \in I$, will have a least upper bound and a greatest lower bound

$$\sum_{i \in I} x_i \quad\text{and}\quad \prod_{i \in I} x_i.$$

In any complete lattice there exist a zero element $0$ and a unit element
$1$ such that $0 \le x \le 1$ for every lattice element $x$. Even when we
consider lattices that are not complete we shall always assume that they
have a zero element and a unit element. A lattice is said to be
complemented if for any element $x$ there exists an element $y$ such that
$x + y = 1$ and $xy = 0$. If, for any elements $a$, $b$, and $x$ with
$a \le x \le b$ there exists an element $y$ such that $x + y = b$ and
$xy = a$, then the lattice is said to be relatively complemented. Clearly
every relatively complemented lattice (with a zero element and a unit
element) is complemented.
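
As an illustration of the last two definitions, here is a minimal Python sketch. It assumes a concrete family of finite lattices that is not taken from the text: the divisors of an integer n ordered by divisibility, so that the sum is the least common multiple, the product is the greatest common divisor, the zero element is 1 and the unit element is n. All function names are hypothetical.

```python
from math import gcd


def divisors(n):
    """All positive divisors of n; these are the lattice elements."""
    return [d for d in range(1, n + 1) if n % d == 0]


def lcm(a, b):
    """Lattice sum (least upper bound) under divisibility."""
    return a * b // gcd(a, b)


def is_complemented(n):
    """Every x has some y with x + y = n (unit) and xy = 1 (zero)."""
    ds = divisors(n)
    return all(any(lcm(x, y) == n and gcd(x, y) == 1 for y in ds) for x in ds)


def is_relatively_complemented(n):
    """For a <= x <= b there is y with x + y = b and xy = a."""
    ds = divisors(n)
    for a in ds:
        for b in ds:
            if b % a:  # skip pairs where a does not divide b
                continue
            for x in ds:
                if x % a == 0 and b % x == 0:  # a <= x <= b
                    if not any(lcm(x, y) == b and gcd(x, y) == a for y in ds):
                        return False
    return True
```

Under these assumptions the divisors of 30 form a relatively complemented (hence complemented) lattice, whereas the divisors of 12 do not, since the element 2 has no complement.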

An element $a$ is said to cover an element $b$ if $b < a$ and if there
exists no element $x$ such that $b < x < a$. An element that covers $0$ is
called an atom, and an element covered by $1$ is called a dual atom. A
lattice in which every element is a sum of atoms is said to be atomistic.
A system of elements $x_i$, $i \in I$, in a complete lattice is said to be
independent if

$$\Bigl(\sum_{i \in J} x_i\Bigr)\Bigl(\sum_{i \in K} x_i\Bigr) = 0$$

whenever $J$ and $K$ are disjoint subsets of $I$.

We are primarily interested in lattices that are not distributive, but
certain special cases of the distributive law will play an important role.
A complete lattice is said to be continuous¹ if the equation

$$a \sum_{i \in I} x_i = \sum_{i \in I} a x_i$$

holds whenever the set $\{x_i \mid i \in I\}$ is directed. (A partially
ordered set is said to be directed if any two elements of the set have an
upper bound that also belongs to the set.) Two elements $b$ and $c$ are
said to form a modular pair, in symbols $M(b, c)$, if

$$(x + b)c = x + bc \quad\text{whenever } x \le c.$$

If this holds for any two elements $b$ and $c$, then the lattice is said
to be modular. If the relation $M$ is symmetric, i.e., if for any two
elements $b$ and $c$ the conditions $M(b, c)$ and $M(c, b)$ are
equivalent, then the lattice is said to be semi-modular. Finally, a
lattice is said to be special if any two elements that are not disjoint
form a modular pair, i.e., if the condition $M(b, c)$ holds whenever
$bc \neq 0$.
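
The definitions of a modular pair, of modularity and of semi-modularity can be checked mechanically on small finite lattices. The following Python sketch is only an illustration under stated assumptions: the class `FiniteLattice` and its method names are hypothetical, and the two test lattices (the five-element "pentagon" lattice and the lattice of all subsets of a three-element set) are standard examples that do not occur in the text.

```python
from itertools import combinations, product


class FiniteLattice:
    """A finite lattice given by its elements and its order relation leq."""

    def __init__(self, elements, leq):
        self.elements = list(elements)
        self.leq = leq

    def join(self, x, y):
        """Least upper bound x + y, found by exhaustive search."""
        ubs = [u for u in self.elements if self.leq(x, u) and self.leq(y, u)]
        return next(u for u in ubs if all(self.leq(u, v) for v in ubs))

    def meet(self, x, y):
        """Greatest lower bound xy, found by exhaustive search."""
        lbs = [l for l in self.elements if self.leq(l, x) and self.leq(l, y)]
        return next(l for l in lbs if all(self.leq(m, l) for m in lbs))

    def modular_pair(self, b, c):
        """M(b, c): (x + b)c = x + bc whenever x <= c."""
        return all(
            self.meet(self.join(x, b), c) == self.join(x, self.meet(b, c))
            for x in self.elements
            if self.leq(x, c)
        )

    def is_modular(self):
        return all(self.modular_pair(b, c)
                   for b, c in product(self.elements, repeat=2))

    def is_semimodular(self):
        """The relation M is symmetric."""
        return all(self.modular_pair(b, c) == self.modular_pair(c, b)
                   for b, c in product(self.elements, repeat=2))


# Pentagon lattice: 0 < a < b < 1 and 0 < c < 1, with c incomparable to a, b.
N5_STRICT = {("0", "a"), ("0", "b"), ("0", "c"), ("0", "1"),
             ("a", "b"), ("a", "1"), ("b", "1"), ("c", "1")}
n5 = FiniteLattice("0abc1", lambda x, y: x == y or (x, y) in N5_STRICT)

# All subsets of {0, 1, 2} under inclusion.
boolean = FiniteLattice(
    [frozenset(s) for r in range(4) for s in combinations(range(3), r)],
    lambda x, y: x <= y,
)
```

In the pentagon, $M(b, c)$ holds but $M(c, b)$ fails (take $x = a$), so that lattice is neither modular nor semi-modular in the sense defined above, while the subset lattice is modular.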

¹ Such lattices are sometimes called upper continuous, but since the dual
concept of a lower continuous lattice will not be needed here, no confusion
will be caused by the present terminology.

2. Geometries and geometric lattices. It is convenient for our purpose to
take as the undefined concepts of geometry the set consisting of all the
points and the function which associates with every set of points the
subspace which it spans. Thus we introduce:

DEFINITION 2.1. By a GEOMETRY we mean an ordered pair $\langle S, C\rangle$
consisting of a set $S$ and a function $C$ which associates with every
subset $X$ of $S$ another subset $C(X)$ of $S$ in such a way that the
following conditions are satisfied:

(i) $X \subseteq C(X) = C(C(X))$ for every subset $X$ of $S$.

(ii) $C(p) = p$ for every $p \in S$.²

(iii) $C(\emptyset) = \emptyset$.³

(iv) For every subset $X$ of $S$, $C(X)$ is the union of all sets of the
form $C(Y)$ with $Y$ a finite subset of $X$.
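
To make Definition 2.1 concrete, the following Python sketch checks the four conditions by brute force on one small example. The example is an assumption chosen for illustration and is not taken from the text: the points are the seven nonzero vectors of the three-dimensional vector space over the two-element field (encoded as the integers 1 to 7, with XOR as vector addition), and $C(X)$ consists of the nonzero vectors in the linear span of $X$. The function names are hypothetical.

```python
from itertools import combinations

POINTS = frozenset(range(1, 8))  # nonzero vectors of GF(2)^3 as bit masks


def closure(X):
    """Nonzero vectors in the GF(2)-span of X (XOR is vector addition)."""
    span = {0}
    for v in X:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})


def subsets(S):
    """All subsets of the finite set S, as frozensets."""
    S = sorted(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


def is_geometry(S, C):
    """Brute-force check of conditions 2.1(i)-(iv) for a finite S."""
    all_subsets = subsets(S)
    cond_i = all(X <= C(X) == C(C(X)) for X in all_subsets)
    cond_ii = all(C(frozenset({p})) == frozenset({p}) for p in S)
    cond_iii = C(frozenset()) == frozenset()
    # For finite S every subset is finite, so (iv) is checked over all Y in X.
    cond_iv = all(
        C(X) == frozenset().union(*(C(Y) for Y in subsets(X)))
        for X in all_subsets
    )
    return cond_i and cond_ii and cond_iii and cond_iv
```

For instance, `closure({1, 2})` is the three-point set `{1, 2, 3}`, and `is_geometry(POINTS, closure)` confirms that this pair is a geometry in the sense of Definition 2.1.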

DEFINITION 2.2. Suppose $\langle S, C\rangle$ is a geometry.

(i) An element of $S$ is called a POINT of $\langle S, C\rangle$.

(ii) A set of the form $C(X)$ with $X \subseteq S$ is called a SUBSPACE of
$\langle S, C\rangle$. If $Y = C(X)$, then $Y$ is said to be SPANNED by
$X$.

(iii) A subspace of $\langle S, C\rangle$ is said to be $n$-DIMENSIONAL if
it is spanned by a set with $n + 1$ elements but is not spanned by any set
with fewer than $n + 1$ elements.

(iv) By a LINE and a PLANE of $\langle S, C\rangle$ we mean, respectively,
a one-dimensional and a two-dimensional subspace of $\langle S, C\rangle$.
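
The notion of dimension in Definition 2.2(iii) can be computed directly for a finite geometry. The sketch below is an illustration only; it again assumes the seven-point example built from the nonzero vectors of GF(2)^3 (an example not taken from the text), and the function names are hypothetical. Note that, read literally, the definition assigns dimension -1 to the empty subspace, which is spanned by the empty set.

```python
from collections import Counter
from itertools import combinations

POINTS = frozenset(range(1, 8))  # nonzero vectors of GF(2)^3 as bit masks


def closure(X):
    """Nonzero vectors in the GF(2)-span of X (XOR is vector addition)."""
    span = {0}
    for v in X:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})


def dimension(U, S, C):
    """Smallest k - 1 such that some k-element subset of S spans U."""
    for k in range(len(S) + 1):
        if any(C(frozenset(X)) == U for X in combinations(sorted(S), k)):
            return k - 1
    raise ValueError("U is not a subspace")


# Collect every subspace C(X) and tabulate the subspaces by dimension.
SUBSPACES = {
    closure(X)
    for r in range(len(POINTS) + 1)
    for X in combinations(sorted(POINTS), r)
}
DIMENSION_COUNTS = Counter(dimension(U, POINTS, closure) for U in SUBSPACES)
```

In this example there are seven points (dimension 0), seven lines (dimension 1) and a single plane (dimension 2), in addition to the empty subspace.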

From 2.1(iv) it follows that if $X$ and $Y$ are subsets of $S$, and if
$X \subseteq Y$, then $C(X) \subseteq C(Y)$. Together with 2.1(i)-(iii)
this yields:

THEOREM 2.3. The family $\mathcal{A}$ of all subspaces of a geometry
$\langle S, C\rangle$ has the following properties:

(i) $S$ and $\emptyset$ are members of $\mathcal{A}$.

(ii) Every one-element subset of $S$ is a member of $\mathcal{A}$.

(iii) The intersection of any number (finite or infinite) of sets
belonging to $\mathcal{A}$ is a member of $\mathcal{A}$.

² Strictly speaking $C(\{p\}) = \{p\}$. We shall also write $C(p, q)$,
$C(p, q, r)$, ..., $C(X, p)$, $C(X, p, q)$, ... for $C(\{p, q\})$,
$C(\{p, q, r\})$, ..., $C(X \cup \{p\})$, $C(X \cup \{p, q\})$, ....

³ $\emptyset$ is the empty set.

The tie-up between geometries and lattices is now easily established. In
fact, if a family $\mathcal{A}$ of subsets of a set $S$ has the properties
2.3(i)-(iii), then $\mathcal{A}$ is a complete and atomistic lattice under
set-inclusion. The lattice product of any system $X_i$, $i \in I$, of sets
belonging to $\mathcal{A}$ is their set-theoretic intersection, and the
lattice sum of the sets $X_i$ is the smallest member of $\mathcal{A}$
which contains their union. The atoms of $\mathcal{A}$ are the one-element
subsets of $S$. Conversely, any complete and atomistic lattice $A$ is
isomorphic to a family $\mathcal{A}$ consisting of subsets of some set $S$
and satisfying 2.3(i)-(iii). In fact, we may take for $S$ the set of all
atoms of $A$ and correlate with each element $x$ of $A$ the set consisting
of all the atoms $p$ of $A$ for which $p \le x$. This leads to

DEFINITION 2.4. A lattice is said to be GEOMETRIC if it is isomorphic
to the lattice of all subspaces of some geometry.

Each  of  the  next  two  theorems  gives  an  axiomatic  characterization  of 
geometric  lattices.  The  first  is  an  immediate  consequence  of  the  definitions 
involved. 

THEOREM 2.5. A lattice is geometric if and only if it is complete and
atomistic, and has the property that for any atom $p$ and any system of
atoms $q_i$, $i \in I$, the condition

$$p \le \sum_{i \in I} q_i$$

implies that there exists a finite subset $J$ of $I$ such that

$$p \le \sum_{i \in J} q_i.$$

THEOREM 2.6. A lattice is geometric if and only if it is complete,
atomistic and continuous.

3. The exchange property. Our notion of a geometry is an extremely
general one and cannot be expected to have many interesting consequences.
It may for instance happen that two distinct lines have more than one
point in common, and in fact it is easy to construct geometries where one
line is properly contained in another. We now consider a condition which
excludes such pathological situations.

DEFINITION 3.1. A geometry $\langle S, C\rangle$ is said to have the
EXCHANGE PROPERTY if, for any points $p$ and $q$ and any subset $X$ of $S$,
the conditions $p \in C(X, q)$ and $p \notin C(X)$ jointly imply that
$q \in C(X, p)$.
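
A small computation shows both what the exchange property excludes and that it can fail. The Python sketch below is an illustration under stated assumptions, with hypothetical names. It uses two examples that are not taken from the text: a three-point geometry in which the line {1, 2} is properly contained in the line {1, 2, 3}, and the seven-point geometry of nonzero vectors of GF(2)^3.

```python
from itertools import combinations

BAD_POINTS = frozenset({1, 2, 3})
FANO_POINTS = frozenset(range(1, 8))


def bad_closure(X):
    """Small sets and {1, 2} are closed; every other set spans everything."""
    X = frozenset(X)
    if len(X) <= 1 or X == frozenset({1, 2}):
        return X
    return BAD_POINTS


def fano_closure(X):
    """Nonzero vectors in the GF(2)-span of X (XOR is vector addition)."""
    span = {0}
    for v in X:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})


def subsets(S):
    S = sorted(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


def has_exchange_property(S, C):
    """p in C(X, q) and p not in C(X) must imply q in C(X, p)."""
    for X in subsets(S):
        CX = C(X)
        for q in S:
            for p in C(X | {q}) - CX:
                if q not in C(X | {p}):
                    return False
    return True


def lines(S, C):
    """Subspaces spanned by two points but by no smaller set."""
    two = {C(frozenset(c)) for c in combinations(sorted(S), 2)}
    small = {C(frozenset(c)) for r in (0, 1) for c in combinations(sorted(S), r)}
    return two - small
```

In the three-point example the exchange property fails for $X = \{1\}$, $q = 3$, $p = 2$, and one line is properly contained in another; in the seven-point example the exchange property holds and no such containment occurs.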

DEFINITION 3.2. By a MATROID LATTICE we mean a geometric lattice with the
property that, for any atoms $p$ and $q$ and any element $x$, the
conditions $p \le q + x$ and $p \not\le x$ jointly imply that
$q \le p + x$.

THEOREM  3.3.  In  order  for  a  lattice  A  to  be  isomorphic  to  the  lattice  of 
all  subspaces  of  a  geometry  which  has  the  exchange  property  it  is  necessary  and 
sufficient  that  A  be  a  matroid  lattice. 

THEOREM  3.4.     Every  matroid  lattice  is  relatively  complemented. 



Matroid  lattices  have  been  extensively  investigated.  Of  the  numerous 
equivalent  characterizations  of  this  class  of  lattices,  the  one  given  in  the 
next  theorem  is  particularly  interesting. 

THEOREM 3.5. In order for a lattice A to be a matroid lattice it is
necessary and sufficient that A be complete, atomistic, continuous and
semi-modular.

THEOREM 3.6. In any matroid lattice the following conditions hold for
all elements $a$, $b$, $c$, $d$, and all atoms $p$, $q$, $p_0$, $p_1$,
..., $p_n$:

(i) If $a < a + p \le a + q$, then $a + p = a + q$.

(ii) If $ap = 0$, then $a + p$ covers $a$.

(iii) If $(a + b)p = 0$, then $(a + p)b = ab$.

(iv) If $(p_0 + p_1 + \dots + p_{k-1})p_k = 0$ for $k = 1, 2, \dots, n$,
then the system $p_i$, $i = 0, 1, \dots, n$, is independent.

(v) If $a$ and $b$ cover $ab$, then $a + b$ covers $a$ and $b$.

(vi) If $a$ covers $ab$, then $a + b$ covers $b$.

(vii) If $b$ covers $bc$, then $M(b, c)$.

(viii) If $bc < a < c < b + c$, then there exists an element $x$ such that
$bc < x \le b$ and $a = (a + x)c$.

(ix) If $bc < a < c < b + c$, then there exists an element $x$ such that
$bc < x \le b$ and $(a + x)c < c$.

(x) If $bc < a < c < a + b$, then there exists an element $x$ such that
$bc < x \le b$ and $a = (a + x)c$.

Conversely, any geometric lattice which satisfies one of the conditions
(i)-(x) is a matroid lattice.

4. Strongly planar geometries. In the classical approach to affine
geometry, a set $X$ of points is by definition a subspace if and only if it
contains every line with which it has two distinct points in common, and
contains every plane with which it has three non-collinear points in
common. In projective geometry the first of these two conditions alone is
taken as the characteristic property of a subspace. Thus it is true in
either case that if $C(p, q, r) \subseteq X$ whenever $p, q, r \in X$,
then $X$ is a subspace. Another property common to the classical affine and
projective geometries is the fact that two intersecting planes which are
contained in the same 3-space have a line in common. This motivates the
next two definitions.



DEFINITION 4.1. A geometry $\langle S, C\rangle$ is said to be PLANAR if
it has the exchange property and, for every subset $X$ of $S$, the
condition

$$C(p, q, r) \subseteq X \quad\text{whenever } p, q, r \in X$$

implies that $X$ is a subspace of $\langle S, C\rangle$.

DEFINITION 4.2. A geometry $\langle S, C\rangle$ is said to be STRONGLY
PLANAR if it is planar and has the property that any two distinct planes
that are contained in the same 3-space are either disjoint or else their
intersection is a line.

No  simple  condition  is  known  which  characterizes  those  lattices  which 
correspond  to  planar  geometries.  As  regards  strongly  planar  geometries 
we  have: 

THEOREM 4.3. A geometry $\langle S, C\rangle$ is strongly planar if and
only if it has the exchange property and, for any points $p$, $q$, $r$, and
any set $X$ of points, the conditions

$$p \in C(X, q), \quad r \in C(X)$$

jointly imply that there exists a point $s$ such that

$$p \in C(q, r, s) \quad\text{and}\quad s \in C(X).$$

THEOREM 4.4. For any matroid lattice $A$ the following conditions are
equivalent:

(i) $A$ is isomorphic to the lattice of all subspaces of a strongly planar
geometry.

(ii) For any atoms $p$, $q$, $r$ of $A$ and any element $a$ of $A$, the
conditions

$$p \le q + a \quad\text{and}\quad r \le a$$

jointly imply that there exists an atom $s$ such that

$$p \le q + r + s \quad\text{and}\quad s \le a.$$

(iii) $A$ is special.

(iv) For any element $a$ of $A$ and any dual atom $h$ of $A$, if
$0 < ah < a$, then $a$ covers $ah$.

5.  Projective  geometries.  In  order  to  obtain  a  concept  that  corresponds 
more  or  less  to  the  classical  notion  of  a  projective  geometry  we  need 
axioms  to  the  effect  that  any  two  lines  in  the  same  plane  have  a  point  in 
common,  and  that  if  a  set  X  of  points  has  the  property  that  it  contains 
every  line  with  which  it  has  two  points  in  common,  then  X  is  a  subspace. 
These  two  conditions  can  be  stated  as  a  single  axiom : 



DEFINITION 5.1. A geometry $\langle S, C\rangle$ is said to be PROJECTIVE
if it has the exchange property and, for any points $p$ and $q$ and any set
$X$ of points, the conditions

$$p \in C(X, q) \quad\text{and}\quad p \neq q$$

jointly imply that there exists a point $r$ such that

$$p \in C(q, r) \quad\text{and}\quad r \in C(X).$$

DEFINITION 5.2. A lattice is said to be PROJECTIVE if and only if it is
isomorphic to the lattice of all subspaces of a projective geometry.

COROLLARY 5.3. Every projective geometry is strongly planar.

COROLLARY 5.4. Suppose $p$ is a point and $X$ and $Y$ are sets of points
of a projective geometry $\langle S, C\rangle$. If

$$p \in C(X, Y), \quad p \notin C(X) \quad\text{and}\quad p \notin C(Y),$$

then there exist points $q$ and $r$ such that

$$p \in C(q, r), \quad q \in C(X) \quad\text{and}\quad r \in C(Y).$$

With  the  aid  of  these  two  corollaries  we  get  a  particularly  elegant 
characterization  of  projective  lattices: 

THEOREM  5.5.  A  lattice  is  projective  if  and  only  if  it  is  complete, 
atomistic,  continuous  and  modular. 
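
On finite examples, where completeness and continuity hold automatically, Theorem 5.5 can be observed by brute force. The Python sketch below is an illustration under stated assumptions, with hypothetical names, on two geometries that are not taken from the text: the seven-point geometry of nonzero vectors of GF(2)^3, and a four-point geometry in which every set of at most two points is closed and every larger set spans all four points. Both have the exchange property (a standard fact about these two examples, not rechecked here), so only the second condition of Definition 5.1 is tested.

```python
from itertools import combinations

FANO = frozenset(range(1, 8))  # nonzero vectors of GF(2)^3 as bit masks
AG22 = frozenset(range(4))     # four points, lines are all pairs


def fano_closure(X):
    """Nonzero vectors in the GF(2)-span of X (XOR is vector addition)."""
    span = {0}
    for v in X:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})


def ag22_closure(X):
    """Sets with at most two points are closed; larger sets span AG22."""
    X = frozenset(X)
    return X if len(X) <= 2 else AG22


def subsets(S):
    S = sorted(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


def is_modular(S, C):
    """(x + b)c = x + bc whenever x <= c, in the lattice of subspaces."""
    A = {C(X) for X in subsets(S)}
    return all(C(x | b) & c == C(x | (b & c))
               for b in A for c in A for x in A if x <= c)


def has_projective_property(S, C):
    """p in C(X, q), p != q implies p in C(q, r) for some r in C(X)."""
    for X in subsets(S):
        CX = C(X)
        for q in S:
            for p in C(X | {q}):
                if p != q and not any(p in C({q, r}) for r in CX):
                    return False
    return True
```

The seven-point geometry satisfies the projective condition and its subspace lattice is modular; the four-point geometry has two disjoint lines in one plane, fails the projective condition, and its subspace lattice is not modular.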

The notion of a projective geometry as defined here is obviously more
general than the classical concept, since we put no restriction on the
dimension and do not exclude geometries in which there are degenerate
lines consisting of only two distinct points. However, this generalization
is less radical than it might appear at first glance. Every projective
lattice $A$ is a direct product of indecomposable sublattices,

$$A = \prod_{i \in I} A_i.$$

When applied to the lattice of all subspaces of a projective geometry
$\langle S, C\rangle$, this decomposition corresponds to a partitioning of
$S$ into subspaces $S_i$ in such a way that two distinct points belong to
the same subspace if and only if they determine a non-degenerate line.
Some of these components may be trivial, consisting of just one point or
of just one line, and others may be non-Arguesian planes. With these
exceptions, we can associate with each component $S_i$ a division ring and
introduce coordinates in the manner of classical geometry, with the sole
difference that the number of coordinates may be infinite.

This  brings  us  to  the  subject  of  Desargues'  Law: 

DEFINITION 5.6. A geometry $\langle S, C\rangle$ is said to be ARGUESIAN
if it is projective and, for any points $p_0$, $p_1$, $p_2$, $q_0$, $q_1$,
$q_2$, the condition

$$C(p_1, q_1) \cap C(p_2, q_2) \subseteq C(p_0, q_0)$$

implies that

$$C(p_1, p_2) \cap C(q_1, q_2) \subseteq C\bigl((C(p_0, p_1) \cap C(q_0, q_1)) \cup (C(p_0, p_2) \cap C(q_0, q_2))\bigr).$$

DEFINITION 5.7. A lattice is said to be ARGUESIAN if and only if it is
isomorphic to the lattice of all subspaces of an Arguesian geometry.

The formulation of Desargues' Law in Definition 5.6 differs from the
classical version in that no restriction is placed on the six points
involved (such as that they be distinct, or that the three pairs $p_i$,
$q_i$, $i = 0, 1, 2$, lie on three distinct but concurrent lines).
However, the two formulations are actually equivalent, for some of the
special cases that are normally excluded are actually valid in all
projective geometries, while the remaining cases follow from the classical
Desargues' Law.

It is of course easy to write down a lattice-theoretic version of
Desargues' Law, involving six atoms (5.8(ii)). It is an interesting fact
that this condition actually holds with the six atoms replaced by any six
lattice elements (5.8(iii)). Perhaps more important, however, is the fact
that this condition is actually equivalent to a lattice identity
(5.8(iv)).

THEOREM 5.8. If $A$ is a geometric lattice, then the following conditions
are equivalent:

(i) $A$ is Arguesian.

(ii) $A$ is modular and, for any atoms $p_0$, $p_1$, $p_2$, $q_0$, $q_1$,
$q_2$ of $A$, the condition

$$(p_1 + q_1)(p_2 + q_2) \le p_0 + q_0$$

implies that

$$(p_1 + p_2)(q_1 + q_2) \le (p_0 + p_1)(q_0 + q_1) + (p_0 + p_2)(q_0 + q_2).$$

(iii) For any elements $a_0$, $a_1$, $a_2$, $b_0$, $b_1$, $b_2 \in A$, the
condition

$$(a_1 + b_1)(a_2 + b_2) \le a_0 + b_0$$

implies that

$$(a_1 + a_2)(b_1 + b_2) \le (a_0 + a_1)(b_0 + b_1) + (a_0 + a_2)(b_0 + b_2).$$


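(iv) For any elements $a_0$, $a_1$, $a_2$, $b_0$, $b_1$, $b_2 \in A$, if

$$y = (a_0 + a_1)(b_0 + b_1)\bigl[(a_0 + a_2)(b_0 + b_2) + (a_1 + a_2)(b_1 + b_2)\bigr],$$

then

$$(a_0 + b_0)(a_1 + b_1)(a_2 + b_2) \le a_0(a_1 + y) + b_0(b_1 + y).$$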

Observe that in (iii) and (iv) we do not assume the modular law; it
turns out to be a consequence of the given conditions.⁴ In terms of the
decomposition discussed above, a projective lattice $A$ is Arguesian if and
only if none of its indecomposable factors is isomorphic to the lattice of
all subspaces of a non-Arguesian projective plane.

6. Affine geometries. We define an affine geometry to be a strongly
planar geometry in which Euclid's parallel axiom holds:

DEFINITION 6.1. A geometry $\langle S, C\rangle$ is said to be AFFINE if
and only if $\langle S, C\rangle$ is strongly planar and has the following
property: For any plane $P$, line $L$, and point $p$, the conditions

$$p \in P, \quad L \subseteq P \quad\text{and}\quad p \notin L$$

jointly imply that there exists a unique line $L'$ such that

$$p \in L', \quad L' \subseteq P \quad\text{and}\quad L \cap L' = \emptyset.$$
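
The parallel property in Definition 6.1 can be verified by brute force on a small example. The Python sketch below is an illustration under stated assumptions, with hypothetical names: it takes the nine points of the coordinate plane over the three-element field (an example not taken from the text) and generates subspaces by repeatedly adding the third point of the line through any two points already present. Only the parallel property is checked, not strong planarity.

```python
from itertools import combinations, product

POINTS = frozenset(product(range(3), repeat=2))  # the plane over GF(3)


def third_point(p, q):
    """Third point of the line through p and q: -(p + q) modulo 3."""
    return tuple((-(a + b)) % 3 for a, b in zip(p, q))


def closure(X):
    """Smallest set containing X and closed under taking third points."""
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for p, q in combinations(sorted(closed), 2):
            r = third_point(p, q)
            if r not in closed:
                closed.add(r)
                changed = True
    return frozenset(closed)


# Every line is the closure of a pair of distinct points.
LINES = {closure({p, q}) for p, q in combinations(sorted(POINTS), 2)}


def parallels_through(p, L):
    """Lines through p that are disjoint from L."""
    return [M for M in LINES if p in M and not (M & L)]
```

In this example there are twelve lines of three points each, the nine points form the only plane, and through every point off a given line there passes exactly one line disjoint from it.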

DEFINITION 6.2. A lattice is said to be AFFINE if and only if it is
isomorphic to the lattice of all subspaces of an affine geometry.

THEOREM 6.3. A lattice $A$ is affine if and only if it is a special
matroid lattice with the following property: For any atoms $p$, $q$, and
$r$, if

$$p < p + q < p + q + r,$$

then there exists a unique element $x$ such that

$$r < x < p + q + r \quad\text{and}\quad (p + q)x = 0.$$

⁴ In fact, in any lattice $A$, (iv) implies (iii) and (iii) in turn implies
that $A$ is modular.

The relation between our concepts of an affine geometry and of a
non-degenerate projective geometry is precisely analogous to the relation
between their classical counterparts:

THEOREM 6.4. If $p$ is an atom of an affine lattice $A$, then the set of
all elements $x \in A$ with $p \le x$ is an indecomposable projective
lattice under the partially ordering relation defined on $A$.

THEOREM 6.5. If $h$ is a dual atom of an indecomposable projective lattice
$A$, then the set $A_h$ consisting of $0$ and of all elements $x \in A$
with $x \not\le h$ is an affine lattice under the partially ordering
relation on $A$. Conversely, for any affine lattice $B$ there exist an
indecomposable projective lattice $A$ and a dual atom $h$ of $A$ such that
$B$ is isomorphic to $A_h$.
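
The construction of $A_h$ in Theorem 6.5 can be carried out explicitly on a small example. The Python sketch below is an illustration under stated assumptions, with hypothetical names: $A$ is taken to be the lattice of subspaces of the seven-point geometry of nonzero vectors of GF(2)^3 (an example not taken from the text), and $h$ is the line {1, 2, 3}. Each element of $A_h$ is identified with the set of its points lying outside $h$.

```python
from itertools import combinations

FANO = frozenset(range(1, 8))  # nonzero vectors of GF(2)^3 as bit masks


def fano_closure(X):
    """Nonzero vectors in the GF(2)-span of X (XOR is vector addition)."""
    span = {0}
    for v in X:
        span |= {s ^ v for s in span}
    return frozenset(span - {0})


# The projective lattice A: all subspaces of the seven-point geometry.
A = {
    fano_closure(X)
    for r in range(len(FANO) + 1)
    for X in combinations(sorted(FANO), r)
}

h = fano_closure({1, 2})  # a dual atom of A: the line {1, 2, 3}

# A_h: the zero element together with every element not contained in h.
A_h = {x for x in A if not x or not x <= h}

# Identify each element of A_h with its set of points outside h.
REST = FANO - h
AFFINE_SUBSPACES = {x - h for x in A_h}

# Expected result: the four-point affine plane, whose subspaces are the
# empty set, the four points, all six pairs, and the whole point set.
EXPECTED = {frozenset(c) for r in range(3) for c in combinations(sorted(REST), r)}
EXPECTED.add(REST)
```

Here $A_h$ has twelve elements, and the resulting geometry on the four remaining points is the affine plane in which every pair of points is a line and each line has exactly one parallel.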

7. Applications of geometry to lattice theory. As can be seen from the
above discussion, the applications of lattice theory to the axiomatization
of geometry have yielded radically different and quite simple
characterizations of the geometries considered. Similar work has been done
with other types of geometries, and it is quite certain that more can be
done along these lines.

But  these  investigations  have  also  aided,  both  directly  and  indirectly, 
in  the  study  of  certain  problems  in  lattice  theory.  We  shall  mention 
briefly  some  examples  that  illustrate  this  point. 

Modular lattices may be regarded as a generalization of projective
geometry. Since every projective lattice is complemented, it might be
more reasonable to consider only complemented modular lattices. Just
how far-reaching a generalization is this? A partial answer was provided
by von Neumann, who showed that every complemented modular lattice
which satisfies certain conditions (namely, possesses an $n$-frame with
$n \ge 4$) is isomorphic to the lattice of all principal left ideals of a
regular ring. Since a full matrix ring over a division ring is regular,
this may be regarded as a generalization of the coordinatization theorem
for non-Arguesian geometries. There is however an important problem open
here: To find a condition that is both necessary and sufficient in order
for a complemented modular lattice to be isomorphic to the lattice of all
principal left ideals of a regular ring.

A representation of a different kind was obtained by Frink, who proved
that every complemented modular lattice $B$ is a sublattice of a projective
lattice $A$. The Frink geometry associated with $B$ is a generalization of
the Stone space of a Boolean algebra; its points are the maximal proper
dual ideals of $B$, and the line through two points $P$ and $Q$ consists of
all points $R$ such that $P \cap Q \subseteq R$. The subspace correlated
with a given element $x \in B$ is the set of all points $P$ such that
$x \in P$. It is known that this embedding preserves all identities that
hold in $B$, and from this it follows in particular that $A$ is Arguesian
if and only if $B$ satisfies the condition (iv) of Theorem 5.8.

Still other investigations, by Baer and Inaba, that were inspired by the
coordinatization theorem, concern the lattice of all submodules of a
module over a ring of a certain type and, as a special case, the lattices
of all subgroups of a finite Abelian group. The principal result may be
regarded as a representation theorem for a certain class of modular
lattices, including all the finite dimensional projective lattices.

Even  for  modular  lattices  that  are  not  complemented,  Desargues'  Law 
in  the  form  of  the  condition  (iv)  of  Theorem  5.8  turns  out  to  be  significant. 
Most  of  the  modular  lattices  that  arise  in  applications  of  lattice  theory  are 
isomorphic  to  lattices  of  commuting  equivalence  relations,  and  in  fact 
all  the  known  examples  for  which  this  is  not  the  case  are  of  a  somewhat 
pathological  character.  It  is  therefore  natural  to  try  to  characterize 
axiomatically  the  class  of  all  those  lattices  for  which  such  a  representation 
exists. It is not hard to prove that Desargues' Law is a necessary
condition, but it is still an open question whether this is also sufficient. On the
other  hand,  an  infinite  system  of  axioms  (in  the  form  of  conditional 
equations)  is  known,  which  is  sufficient  as  well  as  necessary,  and  these 
axioms  are  such  that  when  applied  to  the  lattice  of  all  subspaces  of  a 
projective  geometry,  they  reduce  to  certain  configuration  theorems  which 
are  valid  in  all  Arguesian  geometries. 

The family of all equivalence relations over a set $U$, or equivalently the
family of all partitions of $U$, is a geometric lattice. The class
consisting of all lattices of this form (and of their isomorphic images)
can be conveniently characterized by describing the corresponding
geometries. In fact, in order for the lattice of all subspaces of a
geometry $\langle S, C\rangle$ to be isomorphic to the lattice of all
equivalence relations over some set, it is necessary and sufficient that
the following conditions be satisfied:

(1) $\langle S, C\rangle$ is planar and has the exchange property.

(2) Each plane of $\langle S, C\rangle$ has either 3, 4, or 6 points.

(3) For each line $L$ of $\langle S, C\rangle$, either $L$ has exactly two
points, and there are exactly two lines parallel to $L$, or else $L$ has
exactly three points and there is no line parallel to $L$.
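
The point counts in conditions (2) and (3) can be observed directly in a small partition lattice. The Python sketch below is an illustration under stated assumptions, with hypothetical names: it takes $U$ to be a six-element set, orders partitions by refinement, and uses the standard fact (not stated in the text) that a partition of an n-element set into k blocks has rank n - k, so that atoms, lines and planes are the partitions with n - 1, n - 2 and n - 3 blocks. The statement about parallel lines in (3) is not checked.

```python
def set_partitions(elems):
    """Yield every partition of the list elems as a list of blocks."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for part in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + part


def refines(P, Q):
    """P <= Q: every block of P lies inside some block of Q."""
    return all(any(B <= D for D in Q) for B in P)


N = 6
PARTS = [frozenset(frozenset(b) for b in p)
         for p in set_partitions(list(range(N)))]
ATOMS = [P for P in PARTS if len(P) == N - 1]  # one two-element block


def points_on(Q):
    """Number of atoms (points) lying below the partition Q."""
    return sum(refines(P, Q) for P in ATOMS)


LINE_SIZES = {points_on(Q) for Q in PARTS if len(Q) == N - 2}
PLANE_SIZES = {points_on(Q) for Q in PARTS if len(Q) == N - 3}
```

For a six-element set there are 203 partitions and 15 points; every line has two or three points and every plane has three, four, or six points, in agreement with conditions (2) and (3).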

In  Theorem  6.5,  affine  lattices  are  characterized  as  those  lattices  which 
can  be  obtained  from  indecomposable  projective  lattices  by  removing  all  the 
elements  x  contained  in  some  fixed  dual  atom  h,  with  the  exception  of  the 
zero element. If h is not a dual atom, this process still leads to a special
matroid lattice, but only half of Euclid's parallel axiom will be satisfied
(the  uniqueness  part).  The  question  of  what  lattices  can  be  obtained  from 
complemented  modular  lattices  by  removing  more  general  sets,  subject  to 
some  suitable  conditions,  has  been  studied  by  Wilcox.  Some  of  his  results 
have  been  announced  in  abstracts,  but  a  detailed  account  has  not  yet 
appeared. 

These  examples  will  suffice  to  illustrate  the  fact  that  the  investigations 
of  the  connections  between  geometries  and  lattices  have  yielded  something 
of  interest  to  both  subjects. 


Bibliography 

AMEMIYA, I., On the representation of complemented modular lattices. Journal of the
Mathematical Society of Japan, vol. 9 (1957), pp. 263-279.

BAER, R., A unified theory of projective spaces and finite abelian groups. Transactions
of the American Mathematical Society, vol. 52 (1942), pp. 283-343.

BAER, R., Linear algebra and projective geometry. New York, 1952, VIII + 318 pp.

BIRKHOFF, G., Abstract linear dependence and lattices. American Journal of
Mathematics, vol. 57 (1935), pp. 800-804.

BIRKHOFF, G., Combinatory relations in projective geometry. Annals of Mathematics (2),
vol. 36 (1935), pp. 743-748.

BIRKHOFF, G., Lattice theory. New York 1948, XIII + 283 pp.

BIRKHOFF, G., Metric foundations of geometry I. Transactions of the American
Mathematical Society, vol. 55 (1944), pp. 465-492.

BIRKHOFF, G. and FRINK, O., Representations of lattices by sets. Transactions of the
American Mathematical Society, vol. 64 (1948), pp. 299-316.

BIRKHOFF, G. and VON NEUMANN, J., The logic of quantum mechanics. Annals of
Mathematics (2), vol. 37 (1936), pp. 823-843.

CROISOT, R., Axiomatique des treillis semi-modulaires. Comptes Rendus Hebdomadaires
des Séances de l'Académie des Sciences (Paris), vol. 231 (1950), pp. 12-14.

CROISOT, R., Contribution à l'étude des treillis semi-modulaires de longueur infinie.
Annales Scientifiques de l'École Normale Supérieure (3), vol. 68 (1951), pp. 203-265.

CROISOT, R., Diverse caractérisations des treillis semi-modulaires, modulaires et
distributifs. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences
(Paris), vol. 231 (1950), pp. 1399-1401.

CROISOT, R., Quelques applications et propriétés des treillis semi-modulaires de
longueur infinie. Annales de la Faculté des Sciences de l'Université de Toulouse pour
les Sciences Mathématiques et les Sciences Physiques (4), vol. 16 (1952), pp. 11-74.

CROISOT, R., Sous-treillis, produits cardinaux et treillis homomorphes des treillis
semi-modulaires. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences
(Paris), vol. 232 (1951), pp. 27-29.

DILWORTH, R. P., Dependence relations in a semi-modular lattice. Duke Mathematical
Journal, vol. 11 (1944), pp. 575-587.

DILWORTH, R. P., Ideals in Birkhoff lattices. Transactions of the American Mathematical
Society, vol. 49 (1941), pp. 325-353.

DILWORTH, R. P., Note on complemented modular lattices. Bulletin of the American
Mathematical Society, vol. 46 (1940), pp. 74-76.

DILWORTH, R. P., The arithmetic theory of Birkhoff lattices. Duke Mathematical Journal,
vol. 8 (1941), pp. 286-299.

DUBREIL-JACOTIN, M. L., LESIEUR, L. and CROISOT, R., Leçons sur la théorie des
treillis des structures algébriques ordonnées et des treillis géométriques. Paris 1953,
VIII + 385 pp.

FICKEN, F. A., Cones and vector spaces. American Mathematical Monthly, vol. 47
(1940), pp. 530-533.

FRINK, O., Jr., Complemented modular lattices and projective spaces of infinite
dimension. Transactions of the American Mathematical Society, vol. 60 (1946),
pp. 452-467.

FRYER, K. D. and HALPERIN, I., Coordinates in geometry. Transactions of the Royal
Society of Canada, vol. 48 (1954), pp. 11-26.

FRYER, K. D. and HALPERIN, I., On the coordinatization theorem of J. von Neumann.
Canadian Journal of Mathematics, vol. 7 (1955), pp. 432-444.

FRYER, K. D. and HALPERIN, I., The von Neumann coordinatization theorem for
complemented modular lattices. Acta Universitatis Szegediensis. Acta Scientiarum
Mathematicarum, vol. 16 (1956), pp. 203-249.

HALL, M. and DILWORTH, R. P., The imbedding problem for modular lattices.
Annals of Mathematics (2), vol. 45 (1944), pp. 450-456.

HALPERIN, I., Additivity and continuity of perspectivity. Duke Mathematical Journal,
vol. 5 (1939), pp. 503-511.

HALPERIN, I., Dimensionality in reducible geometries. Annals of Mathematics (2),
vol. 40 (1939), pp. 581-599.

HALPERIN, I., On the transitivity of perspectivity in continuous geometries.
Transactions of the American Mathematical Society, vol. 44 (1938), pp. 537-562.

HSU, C., On lattice theoretic characterization of the parallelism in affine geometry.
Annals of Mathematics (2), vol. 50 (1949), pp. 1-7.

INABA, E., On primary lattices. Journal of the Faculty of Science, Hokkaido
University, vol. 11 (1948), pp. 39-107.

INABA, E., Some remarks on primary lattices. Natural Science Report of the Ochanomizu
University, vol. 2 (1951), pp. 1-5.

IWAMURA, T., On continuous geometries. I. Japanese Journal of Mathematics, vol. 19
(1944), pp. 57-71.

IWAMURA, T., On continuous geometries. II. Journal of the Mathematical Society of
Japan, vol. 2 (1950), pp. 148-164.

IZUMI, S., Lattice theoretic foundation of circle geometry. Proceedings of the Imperial
Academy (Tokyo), vol. 16 (1940), pp. 515-517.

JÓNSSON, B., Modular lattices and Desargues' theorem. Mathematica Scandinavica,
vol. 2 (1954), pp. 295-314.

JÓNSSON, B., On the representation of lattices. Mathematica Scandinavica, vol. 1
(1953), pp. 193-206.

KAPLANSKY, I., Any orthocomplemented complete modular lattice is a continuous
geometry. Annals of Mathematics (2), vol. 61 (1955), pp. 524-541.

KODAIRA, K. and HURUYA, S., On continuous geometries I, II, III (in Japanese).
Zenkoku Shijo Sugaku Danwakai, vol. 168 (1938), pp. 514-531; vol. 169 (1938),
pp. 593-609; vol. 170 (1938), pp. 638-656.

KÖTHE, G., Die Theorie der Verbände, ein neuer Versuch zur Grundlegung der Algebra
und der projectiven Geometrie. Jahresbericht der Deutschen Mathematiker
Vereinigung, vol. 47 (1937), pp. 125-144.

KRISHNAN, V. S., Partially ordered sets and projective geometry. The Mathematics
Student, vol. 12 (1944), pp. 7-14.

LOOMIS, L. H., The lattice theoretic background of the dimension theory of operator
algebras. Memoirs of the American Mathematical Society 1955, No. 18, 36 pp.

MACLANE, S., A lattice formulation for transcendence degrees and p-bases. Duke
Mathematical Journal, vol. 4 (1938), pp. 455-468.

MAEDA, F., A lattice formulation for algebraic and transcendental extensions in
abstract algebras. Journal of Science of the Hiroshima University, vol. 16
(1952-1953), pp. 383-397.

MAEDA, F., Continuous geometry (in Japanese). Tokyo 1952, 2 + 3 + 225 pp.

MAEDA, F., Dimension functions on certain general lattices. Journal of Science of the
Hiroshima University, vol. 19 (1955), pp. 211-237.

MAEDA, F., Dimension lattice of reducible geometries. Journal of Science of the
Hiroshima University, vol. 13 (1944), pp. 11-40.

MAEDA, F., Direct sums and normal ideals of lattices. Journal of Science of the
Hiroshima University, vol. 14 (1949-1950), pp. 85-92.

MAEDA, F., Embedding theorem of continuous regular rings. Journal of Science of the
Hiroshima University, vol. 14 (1949-1950), pp. 1-7.

MAEDA, F., Lattice theoretic characterization of abstract geometries. Journal of
Science of the Hiroshima University, vol. 15 (1951-1952), pp. 87-96.

MAEDA, F., Matroid lattices of infinite length. Journal of Science of the Hiroshima
University, vol. 15 (1951-1952), pp. 177-182.

MAEDA, F., Representations of orthocomplemented modular lattices. Journal of Science
of the Hiroshima University, vol. 14 (1949-1950), pp. 93-96.

MAEDA, F., The center of lattices (in Japanese). Journal of Science of the Hiroshima
University, vol. 12 (1942), pp. 11-15.

MENGER, K., Algebra der Geometrie (Zur Axiomatik der projectiven
Verknüpfungsbeziehungen). Ergebnisse eines mathematischen Kolloquiums, vol. 7 (1936),
pp. 11-12.

MENGER, K., Axiomatique simplifiée de l'algèbre de la géométrie projective. Comptes
Rendus Hebdomadaires des Séances de l'Académie des Sciences (Paris), vol. 206 (1938),
pp. 308-310.

MENGER, K., Bemerkungen zu Grundlagenfragen IV. Axiomatik der endlichen Mengen und
der elementargeometrischen Verknüpfungsbeziehungen. Jahresbericht der Deutschen
Mathematiker Vereinigung, vol. 37 (1928), pp. 309-325.

MENGER, K., La géométrie axiomatique de l'espace projectif. Comptes Rendus
Hebdomadaires des Séances de l'Académie des Sciences (Paris), vol. 228 (1949),
pp. 1273-1274.

MENGER, K., New foundations of projective and affine geometry. Algebra of geometry.
Annals of Mathematics (2), vol. 37 (1936), pp. 456-482.

MENGER, K., Non-Euclidean geometry of joining and intersecting. Bulletin of the
American Mathematical Society, vol. 44 (1938), pp. 821-824.

MENGER, K., On algebra of geometry and recent progress in non-Euclidean geometry.
The Rice Institute Pamphlets, vol. 27 (1940), pp. 41-79.

MENGER, K., Selfdual postulates in projective geometry. American Mathematical
Monthly, vol. 55 (1948), p. 195.

MOUSINHO, M. L., Modular and projective lattices. Summa Brasiliensis Mathematicae,
vol. 2 (1950), pp. 95-112.

VON NEUMANN, J., Algebraic theory of continuous geometries. Proceedings of the
National Academy of Science, U.S.A., vol. 23 (1937), pp. 16-22.

VON NEUMANN, J., Continuous geometry. Proceedings of the National Academy of
Science, U.S.A., vol. 22 (1936), pp. 92-100.

VON NEUMANN, J., Continuous rings and their arithmetics. Proceedings of the National
Academy of Science, U.S.A., vol. 23 (1937), pp. 341-349. Errata, ibid., p. 593.

VON NEUMANN, J., Examples of continuous geometries. Proceedings of the National
Academy of Science, U.S.A., vol. 22 (1936), pp. 101-108.

VON NEUMANN, J., Lectures on continuous geometries, I-III, Princeton 1936-1937.
(Mimeographed lecture notes.)

VON NEUMANN, J., On regular rings. Proceedings of the National Academy of Science,
U.S.A., vol. 22 (1936), pp. 707-712.

VON NEUMANN, J. and HALPERIN, I., On the transitivity of perspective mappings.
Annals of Mathematics (2), vol. 41 (1940), pp. 87-93.

PRENOWITZ, WALTER, Total lattices of convex sets and of linear spaces. Annals of
Mathematics (2), vol. 49 (1948), pp. 659-688.

SASAKI, U., Lattice theoretical characterization of an affine geometry of arbitrary
dimension. Journal of Science of the Hiroshima University, vol. 16 (1952-1953),
pp. 223-238.

SASAKI, U., Lattice theoretic characterization of geometries satisfying "Axiome der
Verknüpfung". Journal of Science of the Hiroshima University, vol. 16 (1952-1953),
pp. 417-423.

SASAKI, U., Orthocomplemented lattices satisfying the exchange axiom. Journal of
Science of the Hiroshima University, vol. 17 (1953-1954), pp. 293-302.

SASAKI, U., Semi-modularity in relatively atomic, upper continuous lattices. Journal of
Science of the Hiroshima University, vol. 16 (1952-1953), pp. 409-416.

SASAKI, U. and FUJIWARA, S., The characterization of partition lattices. Journal of
Science of the Hiroshima University, vol. 15 (1951-1952), pp. 189-201.

SASAKI, U. and FUJIWARA, S., The decomposition of matroid lattices. Journal of Science
of the Hiroshima University, vol. 15 (1951-1952), pp. 183-188.

SCHÜTZENBERGER, M., Sur certains axiomes de la théorie des structures. Comptes
Rendus Hebdomadaires des Séances de l'Académie des Sciences (Paris), vol. 221 (1945),
pp. 218-220.

WHITNEY, H., On the abstract properties of linear dependence. American Journal of
Mathematics, vol. 57 (1935), pp. 509-533.

WILCOX, L. R., An imbedding theorem for semi-modular lattices. Bulletin of the
American Mathematical Society, vol. 60 (1954), p. 532.

WILCOX, L. R., Modular extensions of semi-modular lattices. Bulletin of the American
Mathematical Society, vol. 61 (1955), pp. 524-525.

WILCOX, L. R., Modularity in Birkhoff lattices. Bulletin of the American Mathematical
Society, vol. 50 (1944), pp. 135-138.

WILCOX, L. R., Modularity in the theory of lattices. Annals of Mathematics (2), vol. 40
(1939), pp. 490-505.


Symposium  on  the  Axiomatic  Method 


CONVENTIONALISM  IN  GEOMETRY  * 

ADOLF GRÜNBAUM

Lehigh  University,  Bethlehem,  Pennsylvania,  U.S.A. 

1. Introduction. In what sense and to what extent can the ascription
of a particular metric geometry to physical space be held to have an
empirical warrant? To answer this question we must inquire whether and
how empirical facts function restrictively so as to support a unique metric
geometry as the true description of physical space.

The inquiry is prompted by the conflict of ideas on this issue emerging
in the Albert Einstein volume in Schilpp's Library of Living Philosophers
between Robertson, Reichenbach and Einstein. Robertson characterizes
K. Schwarzschild's attempt to determine observationally the Gaussian
curvature of an astronomical 2-flat as an inspiring implementation of the
empiricist conception of physical geometry. And Robertson deems
Schwarzschild's view to be "in refreshing contrast to the pontifical
pronouncement of Henri Poincaré," [25, p. 325] who had declared that
"Euclidean geometry has, . . ., nothing to fear from fresh experiments"
[20, p. 81] after reviewing the various possible results of stellar parallax
measurements. In the same volume [21, p. 297] and elsewhere [22, Ch. 8;
23, pp. 30-37], Reichenbach maintains, as Carnap had done in his early
monograph Der Raum [3], that the question as to which metric geometry
prevails in physical space is indeed empirical but subject to an important
proviso: it becomes empirical only after a physical definition of
congruence for line segments has been given conventionally by stipulating (to
within a constant factor depending on the choice of unit) what length is to
be assigned to a transported solid rod in different positions of space.
Reichenbach calls this qualified empiricist conception "the relativity of
geometry" and terms "conventionalism" the more radical thesis that
even after the physical meaning of "congruent" has been fixed, it is
entirely a matter of convention which physical geometry is said to prevail.
Believing Poincaré to have been an exponent of conventionalism in this
sense, Reichenbach rejects Poincaré's supposed philosophy of geometry as
erroneous. On the other hand, Einstein criticizes Reichenbach's relativity
of geometry by upholding a particular version of conventionalism which
he attributes to Poincaré [9, pp. 676-679].

* The author is indebted to the National Science Foundation of the U.S.A. for
the support of research.

This  exchange  reveals  that  there  are  several  different  theses  con- 
cerning the  presence  of  stipulational  ingredients  in  physical  geometry  and 
the  warrant  for  their  introduction  which  require  critical  examination  in 
the  course  of  our  inquiry. 

Our  main  concern  is  with  the  respective  roles  of  convention  and  fact  in 
the  ascription  of  a  particular  metric  geometry  to  physical  space  on  the 
basis  of  measurements  with  a  rigid  body.  Accordingly,  we  shall  discuss 
in  turn  the  two  principal  problems  which  have  been  posed  in  connection 
with  the  formulation  of  the  criterion  of  rigidity  and  of  isochronism. 

2.  The  Criterion  of  Rigidity:   I.  The  Status  of  Spatial  Congruence. 

Differential geometry allows us to metrize a given physical surface,
say an infinite blackboard or some portion of it, in various ways so as
to acquire any metric geometry compatible with its topology. Thus,
if we have such a space and a network of Cartesian coordinates on it,
we can just as legitimately metrize the portion above the x-axis by means
of the metric $ds^2 = \frac{dx^2 + dy^2}{y^2}$, which confers a hyperbolic
geometry on that space, as by the Euclidean metric $ds^2 = dx^2 + dy^2$. The
geometer is not disconcerted by the fact that in the former metrization, the
lengths of horizontal segments whose termini have the same coordinate
differences dx will be $ds = dx/y$ and will thus depend on where they are
along the y-axis. What is his sanction for preserving equanimity in the face
of the fact that this metrization commits him to regard a segment for which
dx = 2 at y = 2 as congruent to a segment for which dx = 1 at y = 1,
although the customary metrization would regard the length ratio of
these segments to be 2 : 1? His answer would be that unless one of two
segments is a subset of the other the congruence of two segments is a
matter of convention, stipulation or definition and not a factual matter
concerning which empirical findings could show one to have been
mistaken. He does not say, of course, that a transported solid rod will
coincide successively with the two hyperbolically-congruent segments
but allows for this non-coincidence by making the length of the
transported rod a suitable function of its position rather than a constant. And in
this way, he justifies his claim that the hyperbolic metrization possesses
both epistemological and mathematical credentials as good as those of the
Euclidean one.
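
The arithmetic behind the two segments can be spelled out in a short sketch. It assumes only what the passage states: along a horizontal segment dy = 0 and y is constant, so the hyperbolic length is dx / y. The function names, and the helper `rod_length` that models the position-dependent length assigned to a transported rod, are illustrative and not part of the original text.

```python
from fractions import Fraction


def euclidean_length(dx, y):
    """Length of a horizontal segment of width dx under ds^2 = dx^2 + dy^2."""
    return Fraction(dx)


def hyperbolic_length(dx, y):
    """Length of the same segment under ds^2 = (dx^2 + dy^2) / y^2.

    On a horizontal segment dy = 0 and y is constant, so ds = dx / y.
    """
    if y <= 0:
        raise ValueError("the metric is defined only above the x-axis (y > 0)")
    return Fraction(dx) / Fraction(y)


def rod_length(y, euclidean_rod_length=1):
    """Hyperbolic length assigned to a horizontally held rod at height y.

    The rod's customary (Euclidean) length is held fixed; the hyperbolic
    metrization makes its length a function of position.
    """
    return hyperbolic_length(euclidean_rod_length, y)


seg_a = (2, 2)  # dx = 2 at y = 2
seg_b = (1, 1)  # dx = 1 at y = 1

print(euclidean_length(*seg_a) / euclidean_length(*seg_b))     # 2
print(hyperbolic_length(*seg_a) == hyperbolic_length(*seg_b))  # True
```

Under the customary metrization the ratio is 2 : 1, while the hyperbolic metrization assigns both segments the length 1; the same rod is assigned the length 1/y at height y.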

This conception of congruence was vigorously contested by Bertrand
Russell and defended by Poincaré in a controversy which grew out of the
publication of Russell's Foundations of Geometry [28]. Our first concern
will be with the central issue of that debate.

Russell states the factualist's argument as follows [26, pp. 687-688] 1:

"It seems to be believed that since measurement is necessary to
discover equality or inequality, these cannot exist without
measurement. Now the proper conclusion is exactly the opposite. Whatever
one can discover by means of an operation must exist independently
of that operation: America existed before Christopher Columbus, and
two quantities of the same kind must be equal or unequal before
being measured. Any method of measurement is good or bad
according as it yields a result which is true or false. Mr. Poincaré, on the
other hand, holds that measurement creates equality and inequality.
It follows [then] . . . that there is nothing left to measure and that
equality and inequality are terms devoid of meaning."

Before setting forth the grounds for regarding Russell's argument here
as untenable, it will be useful to analyze the reasoning employed in an
inadequate criticism of it. This analysis will exhibit an important facet of
the relation of the axiomatic method in pure geometry to the description
of physical space.

We are told that Russell's contention can be dismissed by simply
pointing to the theory of models: since physical geometry is a
semantically-interpreted abstract calculus, the customary physical interpretation
of the abstract relation term "congruent" (for line segments) as opposed
to the kind of interpretation given in our hyperbolic metrization above
clearly cannot itself be a factual statement. Hence it is argued that the
alternative metrizability of spatial and temporal continua should never
have been either startling or a matter for dispute. On this view, Poincaré
could have spared himself the trouble of polemicizing against Russell on
behalf of it in the form of a philosophical doctrine of congruence. For, so
the argument runs [7, pp. 9-10], there can be nothing particularly
problematic about the physical interpretation of the term "congruent": like the
physical meaning of all other primitives of the calculus, the denotata of
the abstract relation term "congruent" (for line segments) are specified by
semantical rules which are fully on a par in regard to both conventionality
and importance with those furnishing the interpretation of any of the
other abstract primitives of the calculus. In fact, Tarski's axioms for
elementary Euclidean geometry, which appear in this volume, even
dispense with the primitive "congruent" for line segments and yet yield
(the elementary form of) a metric geometry by using instead a quaternary
predicate δ denoting the equidistance relation between 4 points.

1 An implicit endorsement of this argument is given by H. von Helmholtz [33,
p. 15].
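
To make concrete what it means to supply denotata for a quaternary equidistance predicate, the following sketch interprets one abstract predicate in two ways on points of the upper half-plane. It is an illustration under stated assumptions, not Tarski's formal system: points are pairs of integers, and the second interpretation uses the standard half-plane distance formula cosh d = 1 + |p - q|^2 / (2 y_p y_q), comparing the rational quantity |p - q|^2 / (y_p y_q) instead of d itself so that equality is exact. All names are hypothetical.

```python
from fractions import Fraction


def euclid_key(p, q):
    """Squared Euclidean distance; equal keys mean equal Euclidean distances."""
    return Fraction((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def halfplane_key(p, q):
    """A strictly increasing function of the half-plane distance d.

    Since cosh d = 1 + |p - q|^2 / (2 * y_p * y_q), two pairs of points are
    equidistant exactly when |p - q|^2 / (y_p * y_q) agrees for both pairs.
    """
    if p[1] <= 0 or q[1] <= 0:
        raise ValueError("points must lie above the x-axis")
    return euclid_key(p, q) / (p[1] * q[1])


def make_delta(key):
    """Build an equidistance predicate delta(a, b, c, d) from a distance key."""
    def delta(a, b, c, d):
        return key(a, b) == key(c, d)
    return delta


delta_euclidean = make_delta(euclid_key)
delta_hyperbolic = make_delta(halfplane_key)

# The two horizontal segments from the earlier example: dx = 1 at y = 1 and
# dx = 2 at y = 2.
a, b = (0, 1), (1, 1)
c, d = (0, 2), (2, 2)
print(delta_euclidean(a, b, c, d))   # False
print(delta_hyperbolic(a, b, c, d))  # True
```

The same four points satisfy the predicate under one semantical rule and fail it under the other, which is all that the model-theoretic dismissal of Russell's contention relies on.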

That such an argument does not go to the heart of the issue and hence
would have failed to convince Russell can be seen from the following:
The congruence relation for line segments, and correspondingly for
regions of surfaces and of 3-space, is a reflexive, symmetrical and
transitive relation in these respective classes of geometrical configurations.
Thus, congruence is a kind of equality relation. Now suppose that one
believes, as Russell and Helmholtz thought they could believe justifiably,
that the spatial equality obtaining between congruent line segments
consists in their each containing the same intrinsic amount of space. Then
one will maintain that in any physico-spatial interpretation of an abstract
geometrical calculus, it is never legitimate to choose arbitrarily what
specific line segments are going to be called "congruent". And, by
the same token, one will assert that in Tarski's aforementioned
axiomatization, it is never arbitrary what quartets of physical points are to be
regarded as the denotata of his quaternary equidistance predicate δ.
Instead the imputation of an intrinsic metric to the extended continua of
space and time will issue in the following contentions: (i) since only
"truly equal" intervals may be called "congruent", Newton [18, pp. 6-8]
was right in insisting that there is only one true metrization of the time
continuum, and (ii) there is no room for choice as to the lines which are to
be called "straight" and hence no choice among alternative metric
geometries of physical space, since the geodesic requirement
$\delta \int ds = 0$, which must be satisfied by the straight lines, is imposed
subject to the restriction that only intrinsically congruent line elements may
be assigned the same length ds.

These considerations show that it will not suffice in this context simply
to take the model-theoretic conception of geometry for granted and
thereby to dismiss the Russell-Helmholtz claim peremptorily in favor of
alternative metrizability. Rather what is needed is a refutation of the
Russell-Helmholtz root-assumption of an intrinsic metric: to exhibit the
untenability of that assumption is to provide the justification of the
model-theoretic affirmation that a given set of physico-spatial facts may be held
to be as much a realization of a Euclidean calculus as of a non-Euclidean
one yielding the same topology.

We shall now see how Riemann and Poincaré furnished the philosophical
underpinning for that affirmation.

The following statement in Riemann's Inaugural Dissertation [24, pp.
274, 286] contains a fundamental insight into the particular character of
the continuous manifolds of space and time:

"Definite parts of a manifold, which are distinguished from one
another by a mark or boundary are called quanta. Their quantitative
comparison is effected by means of counting in the case of discrete
magnitudes and by measurement in the case of continuous ones. 2
Measurement consists in bringing the magnitudes to be compared
into coincidence; for measurement, one therefore needs a means
which can be applied (transported) as a standard of magnitude. If it
is lacking, then two magnitudes can be compared only if one is a
[proper] part of the other and then only according to more or less,
not with respect to how much. ... in the case of a discrete manifold,
the principle [criterion] of the metric relations is already implicit
in [intrinsic to] the concept of this manifold, whereas in the case of a
continuous manifold, it must be brought in from elsewhere
[extrinsically]. Thus, either the reality underlying space must form a
discrete manifold or the reason for the metric relations must be
sought extrinsically in binding forces which act on the manifold."

Russell [28, pp. 66-67] and the writer [13] have noted that, contrary to
Riemann's apparent expectation, the first part of this statement will not
bear critical scrutiny as a characterization of continuous manifolds in
general. Riemann does, however, render here a fundamental feature of
the continua of physical space and time, which are manifolds whose
elements, taken singly, all have zero magnitude. And since our concern is
with the geo-chronometry of continuous physical space and time, we can
disregard defects in his account which do not affect its pertinence to the
latter continua. By the same token, we can ignore inadequacies arising
from his treatment of discrete and continuous types of order as jointly
exhaustive. Instead, we state the valid upshot of his conception relevant
to the spatio-temporal congruence issue before us. Construing his
statement as applying, not only to lengths but also, mutatis mutandis, to
areas and to volumes of higher dimensions, he gives the following
sufficient condition for the intrinsic definability and non-definability
of a metric without claiming it to be necessary as well: in the case of a
discretely-ordered set, the "distance" between two elements can be
defined intrinsically in a rather natural way by the cardinality of the
"interval" determined by these elements. 3 On the other hand, upon
confronting the extended continuous manifolds of physical space and
time, we see that neither the cardinality of intervals nor any of their
other topological properties provide a basis for an intrinsically-defined
metric. The first part of this conclusion was tellingly emphasized by
Cantor's proof of the equi-cardinality of all positive intervals
independently of their length. Thus, there is no intrinsic attribute of the space between
the end-points of a line-segment AB, or any relation between these two
points themselves, in virtue of which the interval AB could be said to
contain the same amount of space as the space between the termini of
another interval CD not coinciding with AB. Corresponding remarks
apply to the time continuum. Accordingly, the continuity we postulate
for physical space and time furnishes a sufficient condition for their
intrinsic metrical amorphousness. 4

2 Riemann apparently does not consider sets which are neither discrete nor
continuous, but we shall consider the significance of that omission below.
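
The contrast drawn here can be written out explicitly. In the discrete case counting supplies a distance; in the continuum a linear map already pairs off the points of any two intervals, so cardinality cannot distinguish their lengths. The symbols below (the enumeration $x_1, x_2, \dots$, the function $f$, and the endpoints $a < b$, $c < e$) are introduced only for illustration.

```latex
% Discrete case: "distance" as the cardinality of the interval
% determined by two elements of a discretely ordered set x_1 < x_2 < ...
\operatorname{dist}(x_i, x_j)
  = \bigl|\{\, x_k : i < k \le j \,\}\bigr| = j - i
  \qquad (i \le j).

% Continuum: a bijection between any two non-degenerate closed intervals.
f : [a, b] \to [c, e], \qquad
f(x) = c + \frac{e - c}{b - a}\,(x - a), \qquad a < b,\ c < e.
```

Since $f$ is one-to-one and onto, $[a, b]$ and $[c, e]$ have the same cardinality whatever the values of $b - a$ and $e - c$, which is why cardinality offers no basis for an intrinsic metric on the continuum.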

3  The  basis  for  the  discrete  ordering  is  not  here  at  issue :  it  can  be  conventional, 
as  in  the  case  of  the  letters  of  the  alphabet,  or  it  may  arise  from  special  properties 
and  relations  characterizing  the  objects  possessing  the  specified  order. 

4 Clearly, this does not preclude the existence of sufficient conditions other than
continuity for the intrinsic metrical amorphousness of sets. But one cannot invoke
densely-ordered, denumerable sets of points (instants) in an endeavor to show that
discontinuous sets of such elements may likewise lack an intrinsic metric: even
without measure theory, ordinary analytic geometry allows the deduction that the
length of a denumerably infinite point set is intrinsically zero. This result is evident
from the fact that since each point (more accurately, each unit point set or
degenerate subinterval) has length zero, we obtain zero as the intrinsic length of the
densely-ordered denumerable point set upon summing, in accord with the usual limit
definition, the sequence of zero lengths obtainable by denumeration (cf. Grünbaum
[11, pp. 297-298]). More generally, the measure of a denumerable point set is always
zero (cf. Hobson [15, p. 166]) unless one succeeds in developing a very restrictive
intuitionistic measure theory of some sort.
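
The summation described in this footnote, and the standard covering argument that gives the same result in measure theory, can be stated as follows. The enumeration $p_1, p_2, \dots$ and the length function $\ell$ are notation assumed here for illustration.

```latex
% Summing the zero lengths of the enumerated points p_1, p_2, ...
\ell\Bigl(\bigcup_{n=1}^{\infty} \{p_n\}\Bigr)
  = \sum_{n=1}^{\infty} \ell(\{p_n\})
  = \lim_{N \to \infty} \sum_{n=1}^{N} 0
  = 0 .

% Covering argument: for every \varepsilon > 0,
\{p_1, p_2, \dots\} \subset \bigcup_{n=1}^{\infty}
  \Bigl(p_n - \frac{\varepsilon}{2^{n+1}},\; p_n + \frac{\varepsilon}{2^{n+1}}\Bigr),
\qquad
\sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n}} = \varepsilon .
```

The $n$-th covering interval has length $\varepsilon / 2^{n}$, so the whole set is covered by intervals of total length at most $\varepsilon$, and its measure is therefore zero.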

These considerations show incidentally that space-intervals cannot be held to be
merely denumerable aggregates. Hence in the context of our post-Cantorean
meaning of "continuous", it is actually not as damaging to Riemann's statement as it
might seem prima facie that he neglected the denumerable dense sets by incorrectly
treating the discrete and continuous types of order as jointly exhaustive. Moreover,
since the distinction between denumerable and super-denumerable dense sets was
almost certainly unknown to Riemann, it is likely that by "continuous" he merely
intended the property which we now call "dense". Evidence of such an earlier usage
of "continuous" is found as late as 1914: cf. Russell [27, p. 138].


210  ADOLF   GRUNBAUM 

The axioms of congruence [35, pp. 42-50] preempt "congruent" to be a
spatial  equality  predicate  but  allow  an  infinitude  of  mutually-exclusive 
congruence  classes  of  intervals.  There  are  no  intrinsic  metric  attributes  of 
intervals,  however,  which  could  be  invoked  to  single  out  one  of  these 
congruence  classes  as  unique.  Hence  only  the  choice  of  a  particular 
extrinsic  congruence  standard  can  determine  a  unique  congruence  class, 
the  rigidity  of  that  standard  under  transport  being  decreed  by  convention. 
And  thus  the  role  of  this  standard  cannot  be  construed  with  Russell  to  be 
the  mere  ascertainment  of  an  otherwise  intrinsic  equality  obtaining 
between  the  intervals  belonging  to  the  congruence  class  defined  by  it. 
Similarly  for  time  intervals  and  the  periodic  devices  which  define  temporal 
congruence.  And  hence  there  can  be  no  question  at  all  of  an  empirically  or 
factually  determinate  metric  geometry  or  chronometry  until  after  a 
physical  stipulation  of  congruence.5 

A  concluding  remark  on  the  special  importance  of  the  equality  term 
"congruent"  (for  line  segments)  vis-a-vis  the  other  primitives  of  the 
calculus  will  precede  turning  our  attention  to  some  of  the  import  of  the 
conventionality  of  congruence. 

Suitable alternative semantical interpretations of the term "congruent",
and correlatively of "straight line," can readily demonstrate
that,  subject  to  the  restrictions  imposed  by  the  existing  topology,  it  is 
always  a  live  option  to  give  either  a  Euclidean  or  a  non-Euclidean  de- 
scription of  the  same  body  of  physico-geometrical  facts.  The  possibility 
of  alternative  semantical  interpretations  of  such  other  primitives  of  rival 
geometrical  calculi  as  "point"  does  not  generally  have  such  relevance  to 
this  demonstration.  Accordingly,  when  one  is  concerned,  as  we  are  here, 
with  noting  that,  even  apart  from  the  logic  of  induction,  the  empirical 
facts  themselves  do  not  uniquely  dictate  the  truth  of  either  Euclidean 
geometry  or  of  one  of  its  non-Euclidean  rivals,  then  the  situation  is  as 
follows:  the  different  physical  interpretations  of  the  term  "congruent" 
(and  hence  of  "straight  line")  in  the  respective  geometrical  calculi  enjoy  a 
more  central  importance  in  the  discussion  than  the  semantics  of  such 
other  primitives  of  these  calculi  as  "point,"  since  the  latter  generally  have 
the  same  physical  meaning  in  both  the  Euclidean  and  non-Euclidean  de- 
scriptions. Moreover,  once  we  cease  to  look  at  physical  geometry  as  a 
descriptively-interpreted  system  of  abstract  synthetic  geometry  and  regard 
it  instead  as  an  interpreted  system  of  abstract  differential  geometry  of  the 

5 For a detailed critique of A. N. Whitehead's perceptualistic objections to this
conclusion [34, ch. VI; 35, ch. III; 36, passim] see Grünbaum [13].


CONVENTIONALISM  IN  GEOMETRY  211 

Gauss-Riemann  type,  the  pre-eminent  status  of  the  interpretation  of 
"congruent"  is  seen  to  be  beyond  dispute:  by  choosing  a  particular 
distance function $ds = \sqrt{g_{ik}\,dx^i\,dx^k}$ for the line element, we specify not
only what segments are congruent and what lines are straights (geodesics)
but the entire geometry, since the metric tensor $g_{ik}$ fully determines the
Gaussian curvature $K$. To be sure, if one were discussing not the alternative
between a Euclidean and non-Euclidean description of the same
spatial facts but rather the set of all models (including non-spatial ones) of
a  given  calculus,  say  the  Euclidean  one,  then  indeed  the  physical  inter- 
pretation of  "congruent"  and  of  "straight  line"  would  not  merit  any  more 
attention than that of other primitives like "point".

The Import of Riemann's Conception of Congruence.

(a)  F.  Klein's  Relative  Consistency  Proof  of  Hyperbolic  Geometry  and 
H. Poincaré's Anschaulichkeitsbeweis of that geometry.

In  the  light  of  the  conventionality  of  congruence,  F.  Klein's  relative 
consistency  proof  of  hyperbolic  geometry  via  a  model  furnished  by  the 
interior  of  a  circle  on  the  Euclidean  plane  6  appears  as  merely  one  par- 
ticular kind  of  possible  remetrization  of  the  circular  portion  of  that  plane, 
projective geometry having played the heuristic role of furnishing Klein
with  a  suitable  definition  of  congruence.  What  from  the  point  of  view  of 
synthetic  geometry  appears  as  intertranslatability  via  a  dictionary,  appears 
as  alternative  metrizability  from  the  point  of  view  of  differential  geometry. 
Again, Poincaré's kind of Anschaulichkeitsbeweis of a three-dimensional
hyperbolic  geometry  via  a  model  furnished  by  the  interior  of  a  sphere  in 
Euclidean  space  [20,  pp.  75-8]  is  another  example  of  remetrization.  Here 
the  alteration  in  our  customary  definition  of  congruence  is  conveyed  to 
us  pictorially  by  the  effects  of  an  inhomogeneous  force  field  which 
appropriately  shrinks  all  bodies  alike  as  seen  from  the  point  of  view  of  the 
normally  Euclideanly-behaving  bodies. 
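
The remetrization at work in Klein's model can be made concrete with a small numerical sketch. It assumes the standard distance formula for the Beltrami-Klein disk of unit radius and curvature -1, cosh d = (1 - p.q) / sqrt((1 - |p|^2)(1 - |q|^2)), which is not stated in the text above; the function name `klein_distance` and the sample points are illustrative only. Two segments of equal Euclidean length receive different lengths under the new definition of congruence, depending on where in the disk they lie.

```python
import math

def klein_distance(p, q):
    """Hyperbolic distance between points p and q of the open unit disk,
    using the Beltrami-Klein metric of curvature -1 (assumed formula)."""
    px, py = p
    qx, qy = q
    pp = px * px + py * py
    qq = qx * qx + qy * qy
    if pp >= 1.0 or qq >= 1.0:
        raise ValueError("points must lie strictly inside the unit circle")
    dot = px * qx + py * qy
    arg = (1.0 - dot) / math.sqrt((1.0 - pp) * (1.0 - qq))
    # Guard against rounding slightly below 1 when p and q coincide.
    return math.acosh(max(arg, 1.0))

# Two segments with the same Euclidean length 0.1 along a diameter.
near_centre = klein_distance((0.0, 0.0), (0.1, 0.0))
near_rim = klein_distance((0.8, 0.0), (0.9, 0.0))
print(near_centre, near_rim)
```

On the customary metrization the two segments are congruent; on Klein's they are not, the one nearer the rim being the longer.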

(b) Poincaré and the Conventionality of Congruence.

The central theme of Poincaré's so-called conventionalism is essentially
an  elaboration  of  the  thesis  of  alternative  metrizability  whose  fundamen- 
tal justification  we  owe  to  Riemann,  and  not  [12,  §5]  the  radical  con- 
ventionalism attributed  to  him  by  Reichenbach  [23,  p.  36]. 

Poincaré's much-cited and often misunderstood statement concerning the
possibility  of  always  giving  a  Euclidean  description  of  any  results  of  stellar 
parallax  measurements  is  a  less  lucid  statement  of  exactly  the  same  point 

6  For  details,  cf.  Bonola  [1,  pp.  164-175].  For  a  summary  of  E.  Beltrami's  differ- 
ent relative  consistency  proof,  see  Struik  [31,  pp.  152-3]. 



made by him with magisterial clarity in the following passage [20, p. 235]:
"In space we know rectilinear triangles the sum of whose angles is
equal to two right angles; but equally we know curvilinear triangles
the sum of whose angles is less than two right angles. ... To give the
name of straights to the sides of the first is to adopt Euclidean geometry;
to give the name of straights to the sides of the latter is to adopt
the non-Euclidean geometry. So that to ask what geometry it is proper
to adopt is to ask, to what line is it proper to give the name straight?
It is evident that experiment can not settle such a question."
Now,  the  equivalence  of  this  contention  to  Riemann's  view  of  con- 
gruence becomes  evident  the  moment  we  note  that  the  legitimacy  of 
identifying  lines  which  are  curvilinear  in  the  usual  geometrical  parlance 
as  "straights"  is  vouchsafed  by  the  warrant  for  our  choosing  a  new  defi- 
nition of  congruence  such  that  the  previously  curvilinear  lines  become 
geodesics of the new congruence. Corresponding remarks apply to Poincaré's
contention that we can always preserve Euclidean geometry in
the face of any data obtained from stellar parallax measurements: if the
paths of light rays are geodesics on a particular definition of congruence,
as indeed they are in the Schwarzschild procedure cited by Robertson,
and if the paths of light rays are found parallactically to sustain non-Euclidean
relations on that metrization, then we need only choose a
different definition of congruence such that these same paths will no
longer be geodesics and that the geodesics of the newly chosen congruence
are  Euclideanly  related.  From  the  standpoint  of  synthetic  geometry,  the 
latter  choice  effects  a  renaming  of  optical  and  other  paths  and  thus  is 
merely  a  recasting  of  the  same  factual  content  in  Euclidean  language 
rather  than  a  revision  of  the  extra-linguistic  content  of  optical  and  other 
laws.7 Since Poincaré's claim here is a straightforward elaboration of the
metric  amorphousness  of  the  continuous  manifold  of  space,  it  is  not  clear 
how  Robertson  can  reject  it  as  a  "pontifical  pronouncement"  and  even 
regard  it  as  being  in  contrast  with  what  he  calls  Schwarzschild's  "sound 
operational  approach  to  the  problem  of  physical  geometry."  [25,  pp. 
324-5].  For  Schwarzschild  had  rendered  the  question  concerning  the 
prevailing  geometry  factual  only  by  the  adoption  of  a  particular  spatial 

7 The remetrizational retainability of Euclideanism affirmed by Poincaré [20,
pp.  81-86]  thus  involves  a  merely  linguistic  interdependence  of  the  geometric  theory 
of  rigid  solids  and  the  optical  theory  of  light  rays.  This  interdependence  is  logically 
different,  as  we  shall  see  in  Section  3,  from  P.  Duhem's  conception  [6,  Part  II,  ch. 
VI]  of  an  epistemological  interdependence,  which  Einstein  espouses. 



metrization  based  on  the  travel  times  of  light,  which  does  indeed  turn  the 
direct light paths of his astronomical triangle into geodesics.

There are two respects, however, in which Poincaré is open to criticism
in this connection:

(i)  He  maintained  [20,  p.  81]  that  it  would  always  be  regarded  as  most 
convenient  to  preserve  Euclidean  geometry,  even  at  the  price  of  re- 
metrization,  on  the  grounds  that  this  geometry  is  the  simplest  ana- 
lytically [20,  p.  65].  Precisely  the  opposite  development  materialized  in 
the  general  theory  of  relativity:  Einstein  forsook  the  simplicity  of  the 
geometry  itself  in  the  interests  of  being  able  to  maximize  the  simplicity 
of  the  definition  of  congruence.  He  makes  clear  in  his  fundamental  paper 
of  1916  that  had  he  insisted  on  the  retention  of  Euclidean  geometry  in  a 
gravitational  field,  then  he  could  not  have  taken  "one  and  the  same  rod, 
independently  of  its  place  and  orientation,  as  a  realization  of  the  same 
interval."  [8,  p.  161] 

(ii)  Even  if  the  simplicity  of  the  geometry  itself  were  the  sole  determi- 
nant of  its  adoption,  that  simplicity  might  be  judged  by  criteria  other 
than Poincaré's analytical simplicity. Thus, Menger has urged that
from  the  point  of  view  of  a  criterion  grounded  on  the  simplicity  of  the 
undefined  concepts  used,  hyperbolic  and  not  Euclidean  geometry  is  the 
simplest [16, p. 66].

On the other hand, if Poincaré were alive today, he could point to an
interesting  recent  illustration  of  the  sacrifice  of  the  simplicity  and 
accessibility  of  the  congruence  standard  on  the  altar  of  maximum 
simplicity  of  the  resulting  theory.  Astronomers  have  recently  proposed 
to remetrize the time continuum for the following reason: when the mean
solar  second,  which  is  a  very  precisely  known  fraction  of  the  period  of 
the  earth's  rotation  on  its  axis,  is  used  as  a  standard  of  temporal  con- 
gruence, then  there  are  three  kinds  of  discrepancies  between  the  actual 
observational  findings  and  those  predicted  by  the  usual  theory  of  celestial 
mechanics.  The  empirical  facts  thus  present  astronomers  with  the  follow- 
ing choice:  Either  they  retain  the  rather  natural  standard  of  temporal 
congruence  at  the  cost  of  having  to  bring  the  principles  of  celestial 
mechanics  into  conformity  with  observed  fact  by  revising  them  appropri- 
ately. Or  they  remetrize  the  time  continuum,  employing  a  less  simple 
definition  of  congruence  so  as  to  preserve  these  principles  intact.  Decisions 
taken  by  astronomers  in  the  last  few  years  were  exactly  the  reverse  of 
Einstein's  choice  of  1916  as  between  the  simplicity  of  the  standard  of 
congruence  and  that  of  the  resulting  theory.  The  mean  solar  second  is  to 



be  supplanted  by  a  unit  to  which  it  is  non-linearly  related:  the  sidereal 
year,  which  is  the  period  of  the  earth's  revolution  around  the  sun,  due 
account  being  taken  of  the  irregularities  produced  by  the  gravitational 
influence  of  the  other  planets.  8 

We  see  that  the  implementation  of  the  requirement  of  descriptive 
simplicity  in  theory-construction  can  take  alternative  forms,  because 
agreement  of  astronomical  theory  with  the  evidence  now  available  is 
achievable  by  revising  either  the  definition  of  temporal  congruence  or  the 
postulates  of  celestial  mechanics.  The  existence  of  this  alternative  likewise 
illustrates  that  for  an  axiomatized  physical  theory  containing  a  geo- 
chronometry,  it  is  gratuitous  to  single  out  the  postulates  of  the  theory  as 
having  been  prompted  by  empirical  findings  in  contradistinction  to 
deeming  the  definitions  of  congruence  to  be  wholly  a  priori,  or  vice  versa. 
This  conclusion  bears  out  geochronometrically  Braithwaite's  contention 
in  this  volume  that  there  is  an  important  sense  in  which  axiomatized 
physical  theory  does  not  lend  itself  to  compliance  with  Heinrich  Hertz's 
injunction to "distinguish thoroughly and sharply between the elements ...
which arise from the necessities of thought, from experience,
and from arbitrary choice." [14, p. 8]. 9

(c)  The  impossibility  of  defining  congruence  uniquely  by  stipulating  a 
particular  metric  geometry. 

A  question  which  arises  naturally  upon  undertaking  the  mathematical 
implementation  of  a  given  choice  of  a  metric  geometry  in  the  context  of  a 
particular  set  of  topological  facts  is  the  following:  do  these  facts  in  con- 
junction with  the  desired  metric  geometry  determine  a  unique  definition 
of  congruence?  If  the  answer  were  actually  in  the  affirmative,  as  both 
Carnap  [3,  pp.  54-55]  and  Reichenbach  [23,  pp.  33-34;  22,  pp.  132-133] 
have  maintained,  this  would  mean  that  the  desired  geometry  would 
uniquely  specify  a  metric  tensor  under  given  factual  circumstances  and 
thus, in a particular coordinate system, a unique set of functions $g_{ik}$.
But  Carnap's  and  Reichenbach's  assertion  of  uniqueness  is  erroneous,  as 
is  demonstrated  by  showing  that  besides  the  customary  definition  of 
congruence,  which  assigns  the  same  length  to  the  measuring  rod  every- 
where and  thereby  confers  a  Euclidean  geometry  on  an  ordinary  table  top, 
there  are  infinitely  many  other  definitions  of  congruence  which  likewise 

8  For  a  clear  account  of  the  relevant  astronomical  details,  see  Clemence  [4]. 

9  Braithwaite's  point  was  made  independently  by  Pap  [19],  who  argues  that  the 
analytic-synthetic distinction cannot be upheld for partially-interpreted theoretical
languages  like  that  of  theoretical  physics. 



yield  a  Euclidean  geometry  for  that  surface  but  which  make  the  length  of  a 
rod  depend  on  its  orientation  or  position.  Thus,  consider  our  horizontal 
table  top  equipped  with  a  net-work  of  Cartesian  coordinates  x  and  y  and 
suppose  that  another  such  surface  intersects  the  horizontal  one  at  an 
angle $\theta$ so that their line of intersection is both the $y$-axis of the horizontal
plane and the $\hat{y}$-axis of a rectangular system of coordinates $\hat{x}$ and $\hat{y}$ on the
inclined plane. Assume that the inclined plane has been metrized in the
customary way. But then remetrize the horizontal plane by calling congruent
in it those line segments which are the perpendicular projections
onto it of segments of the inclined plane that are equal in the latter's
metric. Accordingly, we have a mapping

$\hat{x} = x \sec\theta$

$\hat{y} = y,$

and we now assign to a line segment of the horizontal plane whose termini
have the coordinate differences $dx$ and $dy$ not the customary length
$\sqrt{dx^2 + dy^2}$ but rather

$ds = \sqrt{d\hat{x}^2 + d\hat{y}^2} = \sqrt{\sec^2\theta\,dx^2 + dy^2}.$

Nonetheless, upon using the new $g_{ik}$, which are introduced into the $x$, $y$
coordinates by the revised definition of congruence, to compute the
Gaussian curvature of the horizontal table top, we still obtain the Euclidean
value zero. And by merely varying the angle of inclination $\theta$, we obtain
infinitely many different definitions of congruence all of which make
the  length  of  a  given  rod  dependent  on  its  orientation  and  yet  impart  a 
Euclidean  geometry  to  the  horizontal  table  top.  Thus,  the  requirement 
of  Euclideanism  does  not  uniquely  determine  a  metric  tensor,  and, 
contrary  to  Carnap  and  Reichenbach,  there  are  infinitely  many  ways  in 
which  a  measuring  rod  could  squirm  under  transport  as  compared  to  its 
customary  behavior  and  still  yield  a  Euclidean  geometry.  In  fact,  even  for 
plane  Euclidean  geometry,  the  class  of  congruence  definitions  is  far  wider 
than  the  one-parameter  family  yielded  by  our  particular  isometric  map- 
pings of  an  inclined  plane  onto  the  horizontal  one.  Dr.  Samuel  Gulden, 
to  whom  I  presented  the  problem  of  determining  the  class  of  different 
metric  tensors  for  each  kind  of  two-dimensional  and  three-dimensional 
Riemannian  space,  has  pointed  out  that  (i)  in  the  Euclidean  case,  upon 
abandoning  the  restriction  of  our  above  isometric  mappings  to  affine 
coordinate  transformations  and  considering  non-linear  transformations 
with  non-vanishing  Jacobian,  we  can  generate  infinitely  many  other 



metrizations  whose  associated  Gaussian  curvature  is  everywhere  zero. 
For  example,  for  the  admissible  transformation  between  our  two  sets  of 
rectangular coordinates $x$, $y$ and $\hat{x}$, $\hat{y}$ given by

$\hat{x} = x + \tfrac{1}{3}y^3,$ and

$\hat{y} = \tfrac{1}{3}x^3 - y,$

the distance function becomes

$ds^2 = d\hat{x}^2 + d\hat{y}^2 = (1 + x^4)\,dx^2 + 2(y^2 - x^2)\,dx\,dy + (y^4 + 1)\,dy^2.$

In this case, the length of a given rod is generally dependent both on
its position and on its orientation; (ii) the result obtained for Euclidean
space can be generalized to a very large class of Riemann spaces of
various dimensions.
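
The two remetrizations just described can be checked numerically. The following Python sketch is illustrative only: the function names, the finite-difference step, and the sample values are assumptions, not part of the original argument. It shows (a) that the table-top metric assigns different lengths to the same coordinate displacement in different orientations, and (b) that pulling the Euclidean metric of the hatted coordinates back through the non-linear transformation reproduces the three coefficients of the distance function stated above.

```python
import math

def tabletop_length(dx, dy, theta):
    """Length assigned to the horizontal displacement (dx, dy) when congruence
    is defined by projection from a plane inclined at angle theta."""
    return math.sqrt((dx / math.cos(theta)) ** 2 + dy ** 2)

def gulden_map(x, y):
    """The non-linear transformation (x, y) -> (x_hat, y_hat) of the example."""
    return x + y ** 3 / 3.0, x ** 3 / 3.0 - y

def pulled_back_metric(x, y, h=1e-5):
    """Return (g11, g12, g22) of ds^2 = dx_hat^2 + dy_hat^2 expressed in x, y,
    with the Jacobian estimated by central differences."""
    xp, yp = gulden_map(x + h, y)
    xm, ym = gulden_map(x - h, y)
    a, c = (xp - xm) / (2 * h), (yp - ym) / (2 * h)   # d x_hat/dx, d y_hat/dx
    xp, yp = gulden_map(x, y + h)
    xm, ym = gulden_map(x, y - h)
    b, d = (xp - xm) / (2 * h), (yp - ym) / (2 * h)   # d x_hat/dy, d y_hat/dy
    return a * a + c * c, a * b + c * d, b * b + d * d

def stated_metric(x, y):
    """Coefficients read off from the distance function given in the text."""
    return 1 + x ** 4, y ** 2 - x ** 2, y ** 4 + 1

theta = math.pi / 3                       # sec(theta) = 2
along_x = tabletop_length(1.0, 0.0, theta)
along_y = tabletop_length(0.0, 1.0, theta)
numeric = pulled_back_metric(0.7, -1.2)
exact = stated_metric(0.7, -1.2)
print(along_x, along_y, numeric, exact)
```

With the inclination chosen here, a rod spanning one coordinate unit is assigned twice the length when laid along the x-axis as when laid along the y-axis, although the geometry remains Euclidean.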

We  are  now  ready  to  consider  the  second  of  the  two  principal  problems 
which  have  been  posed  in  connection  with  the  criterion  of  rigidity. 

3. The Criterion of Rigidity: II. The Logic of Correcting for "Distorting"
Influences.  Physical  geometry  is  usually  conceived  as  the  system  of  metric 
relations  exhibited  by  transported  solid  bodies  independently  of  their 
particular  chemical  composition.  On  this  conception,  the  criterion  of 
congruence  can  be  furnished  by  a  transported  solid  body  for  the  purpose 
of  determining  the  geometry  by  measurement,  only  if  the  computational 
application  of  suitable  "corrections"  (or,  ideally,  appropriate  shielding) 
has  essentially  eliminated  inhomogeneous  thermal,  elastic,  electric  and 
other  influences,  which  produce  changes  of  varying  degree  ("distortions") 
in  different  kinds  of  materials.  The  demand  for  this  elimination  as  a 
prerequisite  to  the  experimental  determination  of  the  geometry  has  a 
thermodynamic counterpart: the requirement of a means for measuring
temperature  which  does  not  yield  the  discordant  results  produced  by  ex- 
pansion thermometers  at  other  than  fixed  points  when  different  thermo- 
metric  substances  are  employed.  This  thermometric  need  is  fulfilled 
successfully  by  Kelvin's  thermodynamic  scale  of  temperature.  But 
attention  to  the  implementation  of  the  corresponding  prerequisite  of 
physical  geometry  has  led  Einstein  [9,  pp.  676-678]  to  impugn  the  em- 
pirical status  of  that  geometry.  He  considers  the  case  in  which  congruence 
has  been  defined  by  the  diverse  kinds  of  transported  solid  measuring  rods 
as  corrected  for  their  respective  idiosyncratic  distortions  with  a  view  to  then 
making  an  empirical  determination  of  the  prevailing  geometry.  And  in  an 



argument which he attributes to Poincaré, Einstein's thesis is that the
very  logic  of  computing  these  corrections  precludes  that  the  geometry 
itself  be  accessible  to  experimental  ascertainment  in  isolation  from  other 
physical  regularities.  Specifically,  he  states  the  case  in  the  form  of  a 
dialogue between Reichenbach and Poincaré 10:

"Poincaré: The empirically given bodies are not rigid, and consequently
can not be used for the embodiment of geometric intervals.
Therefore,  the  theorems  of  geometry  are  not  verifiable. 
Reichenbach:  I  admit  that  there  are  no  bodies  which  can  be  im- 
mediately adduced  for  the  "real  definition"  of  the  interval.  Never- 
theless, this  real  definition  can  be  achieved  by  taking  the  thermal 
volume-dependence,  elasticity,  electro-  and  magneto-striction,  etc., 
into  consideration.  That  this  is  really  [and]  without  contradiction 
possible,  classical  physics  has  surely  demonstrated. 
Poincaré: In gaining the real definition improved by yourself you
have  made  use  of  physical  laws,  the  formulation  of  which  presupposes 
(in  this  case)  Euclidean  geometry.  The  verification,  of  which  you 
have  spoken,  refers,  therefore,  not  merely  to  geometry  but  to  the 
entire  system  of  physical  laws  which  constitute  its  foundation.  An 
examination  of  geometry  by  itself  is  consequently  not  thinkable. 
—  Why  should  it  consequently  not  be  entirely  up  to  me  to  choose 
geometry  according  to  my  own  convenience  (i.e.,  Euclidean)  and  to 
fit  the  remaining  (in  the  usual  sense  "physical")  laws  to  this  choice 
in  such  manner  that  there  can  arise  no  contradiction  of  the  whole 
with  experience?" 

The  objection  which  Einstein  presents  here  on  behalf  of  conventionalism 
is  aimed  at  a  conception  of  physical  geometry  which  is  empiricist  merely 
in Carnap's and Reichenbach's conditional sense explained in Section 1.
Einstein's  criticism  is  that  the  rigid  body  is  not  even  defined  without 
first  decreeing  the  validity  of  Euclidean  geometry.  And  the  grounds  he 
gives  for  this  conclusion  are  that  before  the  corrected  rod  can  be  used  to 
make  an  empirical  determination  of  the  de  facto  geometry,  the  required 
corrections  must  be  computed  via  laws,  such  as  those  of  elasticity,  which 
involve  Euclideanly-calculated  areas  and  volumes.  But  clearly  the  warrant 

10 It is rather doubtful that Poincaré himself espoused the version of conventionalism
which Einstein links to his name here: in speaking of the variations which
solids exhibit under distorting influences, Poincaré says [20, p. 76]: "we neglect these
variations  in  laying  the  foundations  of  geometry,  because,  besides  their  being  very 
slight,  they  are  irregular  and  consequently  seem  to  us  accidental." 



for  thus  introducing  Euclidean  geometry  at  this  stage  cannot  be  empirical. 

I  now  wish  to  set  forth  my  reasons  for  believing  that  Einstein's  argu- 
ment does  not  succeed  in  making  physical  geometry  a  matter  of  con- 
vention rather  than  fact  in  a  sense  which  is  independent  of  the  alternative 
metrizability  vouchsafed  by  spatio-temporal  continuity. 

There  is  no  question  that  the  laws  used  to  make  the  corrections  for 
deformations  [30,  p.  60;  32,  p.  408]  involve  areas  and  volumes  in  a  funda- 
mental way  (e.g.  in  the  definitions  of  the  elastic  stresses  and  strains)  and 
that  this  involvement  presupposes  a  geometry,  as  is  evident  from  the  area 
and  volume  formulae 

$A = \int \sqrt{g}\,dx^1\,dx^2$ and $V = \int \sqrt{g}\,dx^1\,dx^2\,dx^3,$

where "g" represents the determinant of the components $g_{ik}$ [10, p. 177].
Now  suppose  that  we  begin  with  a  set  of  Euclideanly-formulated  physical 
laws $P_0$ in correcting for the distortions induced by perturbations and
then use the thus Euclideanly-corrected congruence standard for empirically
exploring the geometry of space by determining the metric tensor.
The initial stipulational affirmation of the Euclidean geometry $G_0$ in the
physical laws $P_0$ used to compute the corrections in no way assures that
the geometry obtained by the corrected rods will be Euclidean! If it is non-Euclidean,
then the question is: what will Einstein's fitting of the
physical laws to preserve Euclideanism and avoid a contradiction of the
total theoretical system with experience involve? Will the adjustments in
$P_0$ necessitated by the retention of Euclideanism entail merely a change
in the dependence of the length assigned to the transported rod on such
non-positional parameters as temperature, pressure, magnetic field etc.?
Or  could  the  putative  empirical  findings  compel  that  the  length  of  the 
transported  rod  be  likewise  made  a  function  of  its  position  and  orientation 
in  order  to  square  the  coincidence  findings  with  the  requirement  of 
Euclideanism?  The  temporal  variability  of  distorting  influences  and  the 
possibility  of  obtaining  non-Euclidean  results  by  measurements  carried 
out  in  a  spatial  region  uniformly  characterized  by  standard  conditions  of 
temperature,  pressure,  electric  and  magnetic  field  strength  etc.  show  it  to 
be  quite  doubtful  that  the  preservation  of  Euclideanism  could  always  be 
accomplished  short  of  introducing  the  dependence  of  the  rod's  length  on 
position  and  orientation.  Thus,  the  need  for  remetrizing  in  this  sense  in 
order  to  retain  Euclideanism  cannot  be  ruled  out.  But  this  kind  of  re- 
metrization  does  not  provide  the  requisite  support  for  Einstein's  version 
of  conventionalism,  whose  onus  it  is  to  show  that  the  geometry  by  itself 



cannot  be  held  to  be  empirical  even  when  we  exclude  resorting  to  such 
remetrization. 

That  the  geometry  may  well  be  empirical  in  this  sense  is  seen  from  the 
following  possibilities  of  its  successful  empirical  determination.  After 
assumedly obtaining a non-Euclidean geometry $G_1$ from measurements
with a rod corrected on the basis of Euclideanly-formulated physical
laws $P_0$, we can revise $P_0$ so as to conform to the non-Euclidean
geometry $G_1$ just obtained by measurement. This retroactive revision of
$P_0$ would be effected by recalculating such quantities as areas and volumes
on the basis of $G_1$ and changing the functional dependencies
relating them to temperature and other physical parameters. We thus obtain
a new set of laws $P_1$. Now we use this set $P_1$ of laws to correct the
rods for perturbational influences and then determine the geometry with
the thus corrected rods. If the result is a geometry $G_2$ different from $G_1$,
then if there is convergence to a geometry of constant curvature, we must
repeat this process a finite number of times until the geometry $G_n$
ingredient in the laws $P_n$ providing the basis for perturbation-corrections
is indeed the same to within experimental accuracy as the geometry
obtained by measurements with rods that have been corrected via the set
$P_n$.

If there is such convergence at all, it will be to the same geometry $G_n$
even if the physical laws used in making the initial corrections are not the
set $P_0$, which presupposes Euclidean geometry, but a different set $P$
based on some non-Euclidean geometry or other. That there can exist only
one such geometry of constant curvature $G_n$ would seem to be guaranteed
by the identity of $G_n$ with the unique underlying geometry $G_t$ characterized
by the following properties: (i) $G_t$ would be exhibited by the coincidence
behavior of a transported rod if the whole of the space were actually
free of deforming influences, (ii) $G_t$ would be obtained by measurements
with rods corrected for distortions on the basis of physical laws $P_t$
presupposing $G_t$, and (iii) $G_t$ would be found to prevail in a given relatively
small, perturbation-free region of the space quite independently of the
assumed geometry ingredient in the correctional physical laws. Hence, if
our method of successive approximation does converge to a geometry $G_n$
of constant curvature, then $G_n$ would be this unique underlying geometry
$G_t$. And, in that event, we can claim to have found empirically that $G_t$ is
indeed the geometry prevailing in the entire space which we have explored.
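
The procedure of successive approximation described above has the structure of a fixed-point iteration. The sketch below is a toy model under strong assumptions that go beyond the text: a constant-curvature geometry is represented by a single number (its curvature), and the hypothetical function `measure` stands in for the whole operation of correcting the rods with laws presupposing an assumed geometry and then measuring. The linear response used in the usage lines is invented purely for illustration.

```python
def successive_approximation(measure, assumed, tol=1e-9, max_rounds=100):
    """Repeat 'assume a geometry, correct the rods, measure' until the measured
    geometry agrees with the assumed one to within tol.

    measure: hypothetical callable mapping an assumed curvature to the curvature
             found with rods corrected on that assumption.
    Returns (curvature, number_of_revisions)."""
    for revisions in range(max_rounds):
        measured = measure(assumed)
        if abs(measured - assumed) <= tol:
            return assumed, revisions
        assumed = measured            # revise the laws to fit the new result
    raise RuntimeError("no convergence within max_rounds")

K_TRUE = -0.25                        # invented 'underlying' curvature

def toy_measure(k_assumed):
    """Invented response: the error in the assumption is damped by a factor 0.3."""
    return K_TRUE + 0.3 * (k_assumed - K_TRUE)

from_euclidean, rounds_a = successive_approximation(toy_measure, 0.0)
from_spherical, rounds_b = successive_approximation(toy_measure, 1.0)
print(from_euclidean, rounds_a, from_spherical, rounds_b)
```

In this toy case both starting assumptions lead to the same value, mirroring the claim that convergence, if it occurs, does not depend on the geometry presupposed at the outset.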

But  what  if  there  is  no  convergence?  It  might  happen  that  whereas 
convergence  would  obtain  by  starting  out  with  corrections  based  on  the 



set $P_0$ of physical laws, it would not obtain by beginning instead with
corrections presupposing some particular non-Euclidean set $P$ or vice
versa: just as in the case of Newton's method of successive approximation
[5, p. 286], there are conditions, as A. Suna has pointed out to me, under
which there would be no convergence. We might then nonetheless
succeed as follows in finding the geometry $G_t$ empirically, if our space is
one of constant curvature.

The geometry Gr resulting from measurements by means of a corrected
rod is a single-valued function of the geometry Ga assumed in the
correctional physical laws, and a Laplacian demon having sufficient
knowledge of the facts of the world would know this function Gr = f(Ga).
Accordingly, we can formulate the problem of determining the geometry
empirically as the problem of finding the point of intersection between the
curve representing this function and the straight line Gr = Ga. That there
exists one and only one such point of intersection follows from the
existence of the geometry Gt defined above, provided that our space is
one of constant curvature. Thus, what is now needed is to make
determinations of the Gr corresponding to a number of geometrically different
sets of correctional physical laws Pa, to draw the most reasonable curve
Gr = f(Ga) through this finite number of points (Ga, Gr), and then to find
the point of intersection of this curve and the straight line Gr = Ga.
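
The procedure just described can be sketched numerically. The sketch below is only an illustration under stated assumptions: each geometry of constant curvature is represented by a single number (its curvature), the "most reasonable curve" is taken to be a piecewise-linear one, and the sample points are invented for the example. The names interpolate, find_fixed_point and iterate are hypothetical helpers, not anything given in the text.

```python
from bisect import bisect_right

def interpolate(points, x):
    """Piecewise-linear curve Gr = f(Ga) through (Ga, Gr) points sorted by Ga."""
    xs = [p[0] for p in points]
    # Pick the segment containing x; end segments are extended linearly.
    i = min(max(bisect_right(xs, x) - 1, 0), len(points) - 2)
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def find_fixed_point(points, tol=1e-12):
    """Intersect the fitted curve with the line Gr = Ga by bisection."""
    points = sorted(points)
    lo, hi = points[0][0], points[-1][0]
    g = lambda x: interpolate(points, x) - x   # zero where f(Ga) = Ga
    if g(lo) * g(hi) > 0:
        raise ValueError("sampled curve does not cross the line Gr = Ga")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def iterate(points, start, steps=60):
    """Successive approximation: feed each resulting Gr back in as the next Ga."""
    x = start
    for _ in range(steps):
        x = interpolate(points, x)
    return x

# Invented data: assumed curvature Ga versus resulting curvature Gr.
samples = [(-1.0, -0.2), (0.0, 0.15), (1.0, 0.5), (2.0, 0.85)]
gt = find_fixed_point(samples)
```

With these invented samples the curve has slope less than one, so iterate(samples, 0.0) approaches the same value as find_fixed_point; for a curve whose slope exceeds one in magnitude near the crossing, the iteration would fail to converge while the intersection method would still locate the point.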

Whether  this  point  of  intersection  turns  out  to  be  the  one  representing 
Euclidean  geometry  or  not  is  beyond  the  reach  of  our  conventions, 
barring  a  remetrization.  And  thus  the  least  that  we  can  conclude  is  that 
since  empirical  findings  can  greatly  narrow  down  the  range  of  uncertainty 
as  to  the  prevailing  geometry,  there  is  no  assurance  of  the  latitude  for  the 
choice of a geometry which Einstein takes for granted. Einstein's Duhemian
position would appear to be inescapable only if our proposed method
of  determining  the  geometry  by  itself  empirically  cannot  be  generalized  in 
some  way  to  cover  the  general  relativity  case  of  a  space  of  variable 
curvature  and  if  the  latter  kind  of  theory  turns  out  to  be  true. 

It would seem therefore that, contrary to Einstein, the logic of eliminating
distorting influences prior to stipulating the rigidity of a solid body
is not such as to provide scope for the ingression of conventions over and
above those acknowledged in Riemann's analysis of congruence, and trivial
ones  such  as  the  system  of  units  used.  Our  analysis  of  the  logical  status  of 
the  concept  of  a  rigid  body  thus  leads  to  the  conclusion  that  once  the 
physical  meaning  of  congruence  has  been  stipulated  by  reference  to  a 
solid body for whose distortions allowance has been made computationally
as outlined, then the geometry is determined uniquely by the
totality of relevant empirical facts. It is true, of course, that even apart
from experimental errors, not to speak of quantum limitations on the
accuracy with which the metric tensor of space-time can be meaningfully
ascertained by measurement [29; 37], no finite number of data can uniquely
determine the functions constituting the representations g_ik of the
metric  tensor  in  any  given  coordinate  system.  But  the  criterion  of  inductive 
simplicity  which  governs  the  free  creativity  of  the  geometer's  imagination 
in  his  choice  of  a  particular  metric  tensor  here  is  the  same  as  the  one 
employed  in  theory  formation  in  any  of  the  non-geometrical  portions  of 
empirical  science.  And  choices  made  on  the  basis  of  such  inductive 
simplicity  are  in  principle  true  or  false,  unlike  those  springing  from 
considerations  of  descriptive  simplicity,  which  merely  reflect  conventions. 
The  author  is  indebted  to  Dr.  Samuel  Gulden  of  the  Department  of 
Mathematics,  Lehigh  University,  U.S.A.  for  very  helpful  discussions. 


Bibliography 

[1] BONOLA, R., Non-Euclidean Geometry. New York, 1955. IX + 268 pp.
[2] BROWN, F. A., Biological clocks and the fiddler crab. Scientific American, vol. 190 (April, 1954), pp. 34-37.
[3] CARNAP, R., Der Raum. Berlin, 1922 (Supplement No. 56 of Kant-Studien). 87 pp.
[4] CLEMENCE, G. M., Time and its measurement. American Scientist, vol. 40 (1952), pp. 260-269; and Astronomical time. Reviews of Modern Physics, vol. 29 (1957), p. 5.
[5] COURANT, R., Vorlesungen über Differential- und Integralrechnung, vol. 1. Berlin, 1927. XIV + 410 pp.
[6] DUHEM, P., The Aim and Structure of Physical Theory. Princeton, 1954. XXII + 344 pp.
[7] EDDINGTON, A. S., Space, Time and Gravitation. Cambridge, 1953. VII + 218 pp.
[8] EINSTEIN, A., The foundations of the general theory of relativity. In: The Principle of Relativity, a collection of original memoirs. London, 1923, pp. 111-164.
[9] EINSTEIN, A., Reply to criticisms. In: Albert Einstein: Philosopher-Scientist (edited by SCHILPP, P. A.). Evanston, 1949, pp. 665-688.
[10] EISENHART, L. P., Riemannian Geometry. Princeton, 1949. VII + 306 pp.
[11] GRÜNBAUM, A., A consistent conception of the extended linear continuum as an aggregate of unextended elements. Philosophy of Science, vol. 19 (1952), pp. 288-306.
[12] GRÜNBAUM, A., Carnap's views on the foundations of geometry. In: The Philosophy of Rudolf Carnap (edited by SCHILPP, P. A.), (forthcoming).
[13] GRÜNBAUM, A., Geometry, Chronometry and Empiricism. In: Minnesota Studies in the Philosophy of Science (edited by FEIGL, H. and MAXWELL, G.), vol. III (forthcoming).
[14] HERTZ, H., The Principles of Mechanics. New York, 1956. 271 pp.
[15] HOBSON, E. W., The Theory of Functions of a Real Variable, vol. 1. New York, 1957. XV + 736 pp.
[16] MENGER, K., On algebra of geometry and recent progress in non-euclidean geometry. The Rice Institute Pamphlet, vol. 27 (1940), pp. 41-79.
[17] MILNE, E. A., Kinematic Relativity. Oxford, 1948. VI + 238 pp.
[18] NEWTON, I., Principia (edited by CAJORI, F.). Berkeley, 1947. XXXV + 680 pp.
[19] PAP, A., Are physical magnitudes operationally definable? In: Measurement: Definitions and Theories (edited by CHURCHMAN, C. W. and RATOOSH, P.). New York, 1959 (in press).
[20] POINCARÉ, H., The Foundations of Science. Lancaster, 1946. XI + 553 pp.
[21] REICHENBACH, H., The philosophical significance of the theory of relativity. In: Albert Einstein: Philosopher-Scientist (edited by SCHILPP, P. A.). Evanston, 1949, pp. 287-311.
[22] REICHENBACH, H., The Rise of Scientific Philosophy. Berkeley, 1951. XI + 333 pp.
[23] REICHENBACH, H., The Philosophy of Space and Time. New York, 1958. XVI + 295 pp.
[24] RIEMANN, B., Gesammelte Mathematische Werke (edited by WEBER and DEDEKIND). New York, 1953. X + 558 pp.
[25] ROBERTSON, H. P., Geometry as a branch of physics. In: Albert Einstein: Philosopher-Scientist (edited by SCHILPP, P. A.). Evanston, 1949, pp. 313-332.
[26] RUSSELL, B., Sur les axiomes de la géométrie. Revue de Métaphysique et de Morale, vol. 7 (1899), pp. 684-707.
[27] RUSSELL, B., Our Knowledge of the External World. London, 1926. 251 pp.
[28] RUSSELL, B., The Foundations of Geometry. New York, 1956. 201 pp.
[29] SALECKER, H. and WIGNER, E. P., Quantum Limitations of the Measurement of Space-Time Distances. The Physical Review, vol. 109 (1958), pp. 571-577.
[30] SOKOLNIKOFF, I. S., Mathematical Theory of Elasticity. New York, 1946. XI + 373 pp.
[31] STRUIK, D. J., Classical Differential Geometry. Cambridge, 1950. VIII + 221 pp.
[32] TIMOSHENKO, S. and GOODIER, J. N., Theory of Elasticity. New York, 1951. XVIII + 506 pp.
[33] VON HELMHOLTZ, H., Schriften zur Erkenntnistheorie (edited by HERTZ, P. and SCHLICK, M.). Berlin, 1921. IX + 175 pp.
[34] WHITEHEAD, A. N., The Concept of Nature. Cambridge, 1926. VIII + 202 pp.
[35] WHITEHEAD, A. N., The Principle of Relativity. Cambridge, 1922. XII + 190 pp.
[36] WHITEHEAD, A. N., Process and Reality. New York, 1929. XII + 546 pp.
[37] WIGNER, E. P., Relativistic invariance and quantum phenomena. Reviews of Modern Physics, vol. 29 (1957), pp. 255-268.


PART  II 
FOUNDATIONS  OF  PHYSICS 


Symposium  on  the  Axiomatic  Method 


HOW  MUCH  RIGOR  IS  POSSIBLE  IN  PHYSICS? 

P.  W.  BRIDGMAN 

Harvard  University,  Cambridge,  Massachusetts,  U.S.A. 

Let  me  begin  by  saying  that  I  have  accepted  the  invitation  to  speak  to 
this  Symposium  on  the  Axiomatic  Method  with  extreme  hesitation. 
I  think  I  realize  that  there  is  a  highly  developed  axiomatic  technique  and 
that  to  many  of  you  the  questions  of  greatest  interest  in  this  field  are 
questions  of  technique.  To  an  outsider  like  myself  the  spectacle  of  the 
virtuosity  exhibited  by  some  of  you  in  the  practise  of  this  technique  is  a 
little terrifying. I realize that many, if not all of you, will be impatient
with  the  generalities  which  I  have  to  offer  and  will  be  eager  to  get  on  with 
the  more  vital  business  of  detailed  attack  on  the  numerous  technical 
problems.  I  cannot  even  hope  that  my  generalities  will  not  seem  to  you 
too  obvious  to  be  worth  saying,  and  that  I  may  appear  in  the  light  of  an 
enfant  terrible,  blurting  out  the  things  that  everyone  knows  but  has  too 
much  sense  to  say  out  loud.  If,  in  spite  of  all  this,  I  am  venturing  to  talk 
to  you,  it  is  partly  selfish  because  it  appeared  that  I  could  not  otherwise 
attend  this  meeting,  and  I  expect,  in  spite  of  your  technicalities,  to  pick 
up  points  of  view  which  will  be  new  and  profitable.  But  beyond  this,  I  do 
think  that  it  is  worth  while,  occasionally,  to  say  the  obvious  things  out 
loud,  for  I  do  not  believe  that  we  have,  even  yet,  taken  into  account  all 
the  obvious  things.  In  any  event,  I  am  glad  that  the  program  committee 
put  my  paper  in  the  opening  session,  so  that  you  can  soon  get  it  out  of  the 
way  and  turn  to  more  interesting  and  pressing  matters. 

The  "rigor"  which  I  shall  talk  about  is  not  itself  a  very  precise  or 
rigorous  thing.  In  its  first  usage  "rigor"  is  applied  to  reasoning.  If, 
however,  rigorous  reasoning  is  to  be  possible,  the  objects  and  operations 
of  our  reasoning  must  have  certain  properties,  so  that  "rigor"  comes  to 
have  an  extended  meaning.  In  this  extended  meaning  it  implies  sharpness 
and  precision  and  it  has  overtones  of  certainty.  It  is  in  this  extended 
sense  that  I  shall  be  concerned  with  rigor.  My  task  will  be  to  examine  to 
what  extent  what  we  do  in  physics  can  have  the  attributes  of  sharpness, 
precision,  and  certainty.  I  shall  assume  as  not  needing  argument  that  in 
no field of activity are these attributes actually attainable, but they
function only as limiting ideals, which are never fully attained even in as
abstract  a  domain  as  that  of  postulate  theory. 

All  human  enterprise,  of  which  postulate  theory  and  physics  are 
special  cases,  is  subject  to  one  restriction  on  any  attainable  sharpness  or 
certainty  which  is  so  ubiquitous  and  unavoidable  that  we  seldom  bother 
even  to  mention  it.  The  possibility  of  self-doubt  is  always  with  us;  we 
can  always  ask  ourselves  whether  we  are  really  doing  what  we  think  we 
are  doing  or  how  we  can  be  sure  that  we  have  not  suddenly  gone  insane 
or  are  not  dreaming.  All  our  intellectual  activity  not  only  is,  but  has  to 
be,  based  on  the  premise  that  intellectually  we  are  going  concerns.  In 
so  far  as  this  is  common  to  postulate  theory  and  the  physics  of  the 
laboratory  I  need  not  stop  to  elaborate  the  point  further.  It  seems  to  me, 
however,  that  there  are  points  here  which  in  another  context  might  be 
analyzed  further  than  they  usually  are.  Just  what  is  involved  in  the 
assumption that I am a going concern intellectually? And how shall I go
to work to assure myself that the assumption actually applies to me?
In particular, what is the method by which I can assure myself that I am
not now dreaming? I have seen no such method.

Forgetting now any lack of sharpness arising from self-doubt, there are
certain human activities which apparently have perfect sharpness. The
realm of mathematics and of logic is such a realm, par excellence. Here we
have yes-no sharpness: two numbers are either equal to each other
or they are not; a certain point either lies on a given line or it does not;
there  is  only  one  straight  line  connecting  any  two  points.  Now  it  is  a 
matter  of  observation  that  this  yes-no  sharpness  is  found  only  in  the 
realm  of  things  we  say  as  distinguished  from  the  realm  of  things  we  do. 
Sharpness  is  an  attribute  of  the  way  we  talk  about  our  experience,  in 
particular  whether  we  talk  about  it  in  yes-no  terms,  rather  than  an 
attribute  of  the  experience  itself,  if  you  will  be  charitable  enough  to 
grant  me  meaning  in  such  a  way  of  expression.  Nothing  that  happens  in 
the  laboratory  corresponds  to  the  statement  that  a  given  point  is  either 
on  a  given  line  or  it  is  not. 

There  is  no  question  but  that  we  do  talk  about  aspects  of  experience  in 
yes-no  terms,  and  in  so  far  as  any  field  of  experience  has  such  yes-no 
sharpness  it  has  it  in  virtue  of  the  fact  that  it  is  a  verbal  activity.  One  may 
well  question,  however,  whether  we  have  any  right  to  ascribe  such  yes-no 
properties  to  any  verbal  activity.  What  are  these  words  anyhow?  They 
are  not  static  things,  but  are  themselves  a  form  of  activity  which  varies  in 
some way with every so-called repetition of the word. A word as we use
it is part of a terribly complicated system, involving both present
structure in the brain and the past experience of the brain, most of which we
cannot possibly be conscious of. The assumption that we are going
concerns intellectually involves much more than merely the absence of
self-doubt.

The  physics  of  measurement  and  of  the  laboratory  does  not  have  the 
yes-no  sharpness  of  mathematics,  but  nevertheless  employs  conventional 
mathematics as an indispensable tool. Every physicist combines in his
own person, to greater or less degree, the experimental physicist who
makes measurements in the laboratory, and the theoretical physicist who
represents the results of the measurements by the numbers of
mathematics. These numbers are things that he says or writes on paper. The
jump by which he passes from the operations of the laboratory to what
he mathematically says about the operations is a jump which may not be
bridged logically, and is furthermore a jump which ignores certain
essential features of the physical situation. For the mathematics which
the  physicist  uses  does  not  exactly  correspond  to  what  happens  to  him. 
In  the  laboratory  every  measurement  is  fuzzy  because  of  error.  As  far  as 
reproducing  what  happens  to  him  is  concerned,  the  mathematics  of  the 
physicist  might  equally  well  be  the  mathematics  of  the  rational  numbers, 
in which such irrationals as √2 or π do not occur. Now one would
certainly be going out of one's way to attempt to force theoretical physics
into a straitjacket of the mathematics of the rational numbers as
distinguished from the mathematics of all real numbers, but by forcing it
into the straitjacket of any kind of mathematics at all, with its yes-no
sharpness, one is discarding an essential aspect of all physical experience
and to that extent renouncing the possibility of exactly reproducing that
experience. In this sense, the commitment of physics to the use of
mathematics itself constitutes, paradoxically, a renunciation of the possibility
of rigor.
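
The remark that rational numbers would serve equally well for reproducing a fuzzy measurement can be made concrete. The following is a minimal sketch, not a statement about laboratory practice: it assumes an error bound of one part in a thousand, and rational_within is a hypothetical helper that searches for the fraction of smallest denominator lying inside that bound.

```python
from fractions import Fraction
import math

def rational_within(value, error):
    """Return the fraction with the smallest denominator within +/- error of value."""
    target = Fraction(value)          # exact rational value of the float
    bound = Fraction(error)
    denominator = 1
    while True:
        # Nearest fraction to the target with this denominator.
        candidate = Fraction(round(value * denominator), denominator)
        if abs(candidate - target) <= bound:
            return candidate
        denominator += 1

# Assumed measurement: a length ratio "equal to" the square root of 2,
# known only to within 0.001.
approx = rational_within(math.sqrt(2), 0.001)
```

Under these assumptions the search stops at 41/29, a number that no measurement of the stated accuracy could distinguish from √2.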

The  unavoidable  presence  of  error  in  any  physical  measurement  which 
we  are  here  insisting  on  reminds  one  of  the  fuzziness  in  the  measurement 
of conjugate quantities covered by the Heisenberg principle of
indetermination, but is, I believe, something quite different. The sort of error
that  we  here  are  concerned  with  would  still  be  present  in  our  knowledge  of 
the  so-called  "pure  case"  of  quantum  mechanics.  In  so  far  as  quantum 
theory  treats,  for  example,  the  charge  on  the  electron  or  Planck's  constant 
as  mathematically  sharp  numbers,  as  it  does,  it  is  in  so  far  neglecting  an 
essential aspect of all our experience. It used to be thought that the errors
of physical measurement were a more or less irrelevant epiphenomenon,
which could be avoided in the limit by the construction of better and
better measuring apparatus. This happy conviction appeared less compelling
when the atomic structure of all matter was established, including the
atomic structure of the measuring apparatus. Now, it appears to me, the
linkage of error with every sort of physical measurement must be
regarded as inevitable when it is considered that the knowledge of the
measurement,  which  is  all  we  can  be  concerned  with,  is  a  result  of  the 
coupling  of  the  external  situation  with  a  human  brain.  Even  if  we  had 
adequate  knowledge  of  the  details  of  this  coupling  we  admittedly  could 
not yet use this knowledge in formulating in detail how the unavoidable
fuzziness  should  be  incorporated  in  our  description  of  the  world  nor  how 
we  should  modify  our  present  use  of  mathematics.  About  the  only  thing 
we  can  do  at  present  is  to  continue  in  our  present  use  of  mathematics, 
but  with  the  addition  of  a  caveat  to  every  equation,  warning  that  things 
are  not  quite  as  they  seem. 

Quantum  theory  has  effectively  called  to  attention  certain  other 
important  features  of  the  world  about  us.  The  realm  in  which  quantum 
effects  are  usually  considered  to  be  important  is  in  the  first  instance  the 
realm  of  small  things  —  small  distances  and  short  times.  Phenomena  in 
this  realm  do  not  present  themselves  directly  to  our  unaided  senses,  but 
occur  only  in  conjunction  with  special  types  of  instrument,  with  which  we 
say  that  we  "extend"  the  scope  of  our  senses.  But  if  we  examine  what  we 
actually do, we see that these instruments function through our
conventional senses. Hence, it does not reproduce what actually happens to
say, for example, that the microscope reveals to us a new "microscopic
world". The so-called microscopic world is really a new macroscopic
world  which  we  have  found  how  to  enter  by  inventing  new  kinds  of 
macroscopic  instrument.  The  "world"  of  quantum  phenomena  eventually 
has  to  find  its  description  and  explanation  in  terms  of  the  things  that 
happen to us on the macroscopic scale of everyday life. I think most
quantum  theorists  will  admit  this  if  they  are  pressed,  but  in  spite  of  this 
the  language  of  ordinary  quantum  theory  is  a  language  of  microscopic 
entities  which  we  handle  verbally  just  as  if  they  had  the  existential  status 
of  the  objects  of  daily  life.  There  is  ample  justification  for  this  in  the 
enormous  simplification  which  results  in  our  description  and  our  handling 
of  experience.  This  simplification  is  nevertheless  bought  at  a  price  —  the 
price of neglecting and forgetting some of the unavoidable
accompaniments of all our experience. By thus agreeing to blur some of the
recognizable aspects of experience we have at the same time condemned
ourselves to a loss of possible rigor, using "rigor" with the implications
already explained. This sort of thing is by no means characteristic
exclusively of quantum theory; strictly we should never think of bacteria
without  thinking  of  microscopes  or  think  of  galaxies  without  thinking  of 
telescopes,  but  such  rigor  of  thought  is  hardly  attainable  in  practise. 

Another  matter  which  quantum  theory  has  forcibly  called  to  our 
attention  is  that  the  instrument  of  observation  may  not  properly  be 
separated from the object of observation. Heisenberg's principle of
indetermination is one of the consequences of following out the implications
of  this.  The  principle  that  instrument  of  observation  is  not  to  be  separated 
from  object  of  observation  is,  it  seems  to  me,  a  special  case  of  a  broader 
principle,  namely  that  experience  has  to  be  taken  as  a  whole  and  may 
not  be  analyzed  into  pieces.  In  other  words,  the  operation  of  isolation  is 
not  a  legitimate  operation.  Now  the  operation  of  isolation  is  perhaps  the 
most  universal  of  all  intellectual  operations,  and  without  it  rational 
thought  would  hardly  be  possible.  Nevertheless,  in  the  world  of  quantum 
phenomena situations arise in which our propensity for isolating
definitely gets us into trouble. For instance, the electron is not properly to be
thought  of  in  isolation,  but  only  as  an  aspect  of  the  total  experimental 
set-up  in  which  it  appears.  When  we  view  the  electron  in  this  light  the 
paradox  disappears  from  such  situations  as  the  interference  pattern 
formed  by  the  electron  in  the  presence  of  two  slits,  where  the  electron, 
if  we  treat  it  as  an  ordinary  isolatable  object  that  can  go  through  only  one 
of  the  slits,  apparently  "knows"  of  the  existence  of  the  other  slit  without 
going  through  it.  We  are  thus  driven  to  concede  that  the  operation  of 
isolation  cannot  be  legitimate  "in  principle",  but  this  concession  presents 
us  with  an  extraordinarily  difficult  dilemma,  for  the  very  words  in  which  we 
express  the  illegitimacy  of  the  operation  of  isolation  receive  their  meaning 
only  in  a  context  of  isolation.  In  practise  we  meet  the  situation  as  best 
we  can  by  methods  largely  intuitive  in  character  which  we  have  acquired 
by  long  practise.  But  I  think  that  even  our  best  practise  has  disclosed  no 
method  of  sharply  handling  the  situation  —  the  method  of  isolation  is 
neither  sharply  separated  from  the  method  of  holism,  nor  is  there  any 
sharp  criterion  which  determines  when  we  shall  shift  from  the  one  method 
to  the  other.  Neither  is  it  possible  to  express  sharply  in  language  what  we 
mean  by  the  one  as  distinguished  from  the  other.  The  best  we  can  do 
in  practise  is  a  sort  of  spiralling  approximation,  shifting  back  and  forth 
from one level of operation to the other, and concentrating our attention
first on one aspect and then on another of the total situation. In such a
setting  we  cannot  expect  rigor. 

There  are  many  other  situations  in  which  the  operation  of  isolation 
leads  to  dilemma  and  paradox.  Long  ago,  on  the  classical  level,  the 
concepts  of  thermodynamics  found  their  meaning  in  terms  of  operations 
performed  on  isolated  systems.  Not  only  do  the  fundamental  concepts  of 
energy  and  entropy  receive  their  meaning  in  terms  of  physical  systems 
isolated  in  space,  but  isolation  in  time  is  also  required,  because  otherwise 
reversibility,  or,  more  generally,  recoverability  of  previous  condition, 
does  not  occur.  Without  recoverability  the  concepts  of  thermodynamics 
are  incapable  of  definition.  This  necessity  for  isolation  in  the  fundamental 
definitions  leads  to  logical  difficulties  when  we  attempt  to  extend  the 
notions  of  energy  or  entropy  to  the  universe  as  a  whole.  The  logical  status 
of  any  theorem  involving  the  conservation  of  the  energy  of  the  universe, 
or  the  universal  degradation  of  energy  and  eventual  heat  death  of  the 
universe,  seems  to  me  exceedingly  obscure.  Furthermore,  the  classical 
connection between deterministic and statistical mechanics which
expresses entropy in probabilistic terms seems to me to involve an
illegitimate treatment of the entropy of isolated bodies. It is often said that
an  isolated  system  comprising  many  molecules  approaches,  with  the 
passage  of  time,  a  completely  disordered  state  and  hence  the  condition  of 
maximum  entropy,  because  of  the  "law  of  large  numbers",  in  virtue  of 
which  the  internal  condition  eventually  becomes  one  of  molecular  chaos 
in  spite  of  the  fact  that  the  laws  of  the  individual  molecular  encounters 
are completely deterministic. This, it seems to me, is logically fallacious.
Given an isolated system, with a definite initial distribution and
deterministic individual encounters, logically it can never evolve into a system
with  chaotic  distribution.  To  say  that  chaos  gets  in  through  the  operation 
of  the  "law  of  large  numbers"  seems  to  me  to  introduce  a  completely 
unjustified  and  ad  hoc  concept.  But  chaos  may  logically  get  into  the 
system  through  the  walls  which  are  coupled  to  the  external  world.  This 
coupling  is  part  of  a  divergent  process  —  the  state  of  the  walls  may  not 
be deterministically specified except by coupling them to an
ever-increasing domain of the external world over which we have ever less
control.  The  only  acceptable  method  which  has  been  found  for  dealing 
with  this  divergent  process  is  through  probability.  Here  again  we  have 
paradox  —  the  concept,  entropy,  is  applicable  only  in  a  context  of  isolated 
systems,  but  the  detailed  mechanism,  through  the  operation  of  which 
entropy  functions,  occurs  only  in  non-isolated  systems. 


In general it seems to me that the situations contemplated in
probability analysis are particularly situations in which the jump from theory
to application cannot be made sharply, so that no application of
probability theory can be rigorous. It is particularly important to realize this
now  that  quantum  theory  is  disposed  to  regard  probabilities  as  something 
fundamental and unanalyzable rather than as an artefact in an
essentially deterministic universe. Against this tendency of the theoretical
physicists must be placed, I believe, the realization that the fundamental
concepts of probability have meaning only in the context of a
deterministic background. No situation is ever completely chaotic, but it is only
restricted  aspects  which  are  probabilistic.  We  cannot  say  that  a  particular 
fall  of  a  die  is  undetermined  and  probabilistic  unless  the  die  itself,  the 
table  top  on  which  it  rolls,  and  we  ourselves  who  observe  it  and  talk 
about  it,  retain  their  conventional  deterministic  identity.  We  have  here 
a  special  case  of  the  theorem  that  eventually  any  new  concepts  must  find 
their  meaning  on  the  level  of  daily  life.  And  since  the  level  of  daily  life  is 
preponderantly deterministic, I believe it is impossible to handle
probability consistently as ultimate and unanalyzable.

"Randomness"  is  a  concept  fundamental  in  probability  analysis  and 
of  such  importance  that  lists  of  random  numbers  are  often  printed  and 
employed  in  practical  applications.  Yet  theoretically  no  finite  set  of 
numbers  can  be  completely  random,  because  there  are  an  infinite  number 
of  conditions  of  randomness.  In  practise,  no  set  of  numbers  that  has  been 
printed  or  otherwise  actually  exhibited  can  possibly  be  random,  nor  can  a 
series  of  events  that  has  actually  occurred  be  random,  because  it  is  always 
possible to find some sort of regularity in any finite sequence. The
concept of randomness, so fundamental to the whole conceptual edifice, thus
appears as a loose concept, incapable of realization in practise.
"Randomness" occurs only in the realm of things we say.
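
One way to illustrate the claim that some regularity can always be found in a finite sequence is polynomial interpolation: any list of n numbers is reproduced exactly by a polynomial of degree at most n - 1. The sketch below is offered only as one such illustration, with an invented list of digits; lagrange_rule is a hypothetical helper name, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def lagrange_rule(sequence):
    """Return a polynomial rule p with p(k) == sequence[k] for k = 0 .. n-1."""
    n = len(sequence)

    def p(x):
        total = Fraction(0)
        for i, y in enumerate(sequence):
            # Lagrange basis term: equals y at x == i and 0 at every other index.
            term = Fraction(y)
            for j in range(n):
                if j != i:
                    term *= Fraction(x - j, i - j)
            total += term
        return total

    return p

# Invented "random-looking" list of digits.
digits = [3, 1, 4, 1, 5, 9, 2, 6]
rule = lagrange_rule(digits)
reproduced = [int(rule(k)) for k in range(len(digits))]
```

The list reproduced equals digits, so the printed numbers obey a definite rule, however artificial that rule may appear.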

Probability  theory  runs  into  other  sorts  of  difficulty  when  it  deals  with 
rare events. A literal application of kinetic theory and statistical
mechanics yields a calculably small finite probability for any compound event.
Thus  there  is  a  finite  probability  that  if  we  watch  long  enough  we  shall 
some  day  see  a  pail  of  water  freeze  on  the  fire,  a  conclusion  that  Bertrand 
Russell  has  delighted  to  rub  in.  Or  consider  another  example  in  somewhat 
the  same  vein.  Suppose  that  I  have  measured  some  object  by  ordinary 
laboratory procedures and find it to be 1.500 meters, with some apparent
uncertainty in the last millimeter. Suppose that I choose to report this
measurement by saying that the length of the object is between 1 and 2
meters. Then probability theory states that there is some probability that
this  statement  is  incorrect.  Now  it  seems  to  me  that  a  theory  which  makes 
these  two  statements,  about  the  freezing  water  and  the  error  of  my 
measurement,  is  a  theory  which  fails  to  agree  qualitatively  with  the 
nature  of  everyday  experience.  The  finite  probability  of  freezing  or  of 
error  is  a  property  of  our  mathematics,  not  of  the  situation  which  the 
mathematics  is  designed  to  describe,  and  in  thus  dealing  with  rare  events 
our  probability  analysis  reveals  itself  as  only  an  approximation.  In 
general,  it  seems  to  me  that  one  has  a  right  to  question  any  probability 
analysis  which  predicts  an  event  so  rare  that  it  has  not  yet  been  observed. 
One  might  even  venture  a  theorem  to  this  effect.  Such  a  putative  theorem 
receives  a  certain  justification  when  it  is  considered  that  the  prediction  of 
rare  events  involves  long  range  extrapolations,  which  would  demand  the 
establishment  of  the  fundamental  laws  of  mechanics  with  an  accuracy  far 
beyond  that  actually  attainable. 
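
To see how small the probability in the measurement example is, one may work it out under an assumed error model. The sketch below assumes, purely for illustration, that the error is Gaussian with a standard error of one millimeter; neither the model nor the helper name log10_two_sided_tail comes from the text. The interval edges then lie 500 standard errors from the measured value, and the resulting probability is too small even to be represented as an ordinary floating-point number, so its logarithm is computed from the standard large-argument approximation erfc(x) ~ exp(-x*x) / (x*sqrt(pi)).

```python
import math

def log10_two_sided_tail(k_sigma):
    """Approximate log10 of P(|Z| > k_sigma) for a standard normal Z.

    Uses erfc(x) ~ exp(-x*x) / (x*sqrt(pi)), which is accurate only
    when k_sigma is large (well beyond about 5).
    """
    x = k_sigma / math.sqrt(2)
    return (-x * x - math.log(x * math.sqrt(math.pi))) / math.log(10)

sigma = 0.001                          # assumed standard error: one millimeter
k = (2.0 - 1.5) / sigma                # interval edge is about 500 standard errors away
direct = math.erfc(k / math.sqrt(2))   # underflows to 0.0 in double precision
log10_p = log10_two_sided_tail(k)      # roughly -54290
```

On this assumed model the probability that the statement "between 1 and 2 meters" is incorrect is of the order of 10 to the power -54290, a figure which belongs to the mathematics rather than to anything that could be checked in the laboratory.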

Our  intellectual  difficulties  are  thus  not  peculiar  to  the  new  situations 
revealed  by  quantum  theory,  but  classical  physics  has  always  had  its 
share of difficulty and paradox. Among these difficulties may be
mentioned those of dealing with continuous media. The equations of
hydrodynamics, for instance, purportedly deal with continuous media, but the
variables  in  the  equations  refer  to  the  motion  of  "particles"  of  the  fluid, 
which,  whatever  other  properties  they  may  have,  at  least  have  the 
property  of  identifiability.  Whatever  it  is  that  bestows  the  identifiability 
would  seem  to  violate  the  presumptive  perfect  homogeneity  and  conti- 
nuity of  the  fluid.  The  two  concepts  are  mutually  contradictory  and 
exclusive,  but  nevertheless  our  thinking  seems  to  demand  them,  and  as 
far  as  I  know  no  one  has  invented  a  way  of  getting  along  without  them. 

I  believe  that  there  are  somewhat  similar  difficulties  with  the  concept 
of  "field"  which  by  many  is  regarded  as  fundamental  to  modern  theo- 
retical physics.  We  think  of  the  field  at  any  point  of  space  as  something 
"real",  independently  of  whether  there  is  an  instrument  at  the  point  to 
measure  it.  But  when  we  try  to  account  mathematically  for  the  fact  that 
our  instrument  apparently  responds  to  what  was  there  before  we  went 
there  with  the  instrument,  we  find  that  actually  the  instrument  responds 
to  the  modified  state  of  affairs  after  the  instrument  is  introduced.  (This  is 
shown  by  an  analysis  of  the  Maxwell  stresses.)  Our  attempt  to  give 
instrumental  meaning  to  something  that  exists  in  the  absence  of  the 
instrument seems foredoomed to failure; one can detect the odor of
a logical inconsistency here. Yet our thinking seems to demand that we
attach  a  meaning  to  what  would  be  there  in  the  absence  of  the  instrument, 
whereas  meaning  itself  exists  only  in  a  context  of  instruments. 

All  the  infelicities  and  ineptnesses  which  we  have  encountered  up  to 
now  arise  because  we  have  been  trying  to  do  something  with  our  minds 
which  cannot  be  done.  After  long  experience  we  have  found  how  to  deal 
with  situations  of  this  sort  after  a  fashion.  We  push  the  conventional  line 
of  attack  as  far  as  we  can,  and  when  we  presently  run  into  conceptual 
difficulties,  we  usually  meet  these  difficulties,  not  by  any  drastic  revision 
of  our  conceptual  structure,  but  by  keeping  as  much  of  it  as  we  can  and 
patching  it  up  by  rules  explicitly  warning  of  the  limitations  of  the  con- 
ventional machinery.  There  is  a  certain  resemblance  between  this  general 
situation  and  the  special  situations  in  quantum  theory  to  which  the  Hei- 
senberg  principle  is  applicable.  We  cannot,  for  example,  push  our  con- 
ventional description  of  a  physical  system  in  terms  of  space  and  time  too 
far  toward  the  microscopic  without  running  into  difficulties  with  our 
description  of  the  same  system  in  terms  of  cause  and  effect,  although  on 
the  scale  of  daily  life  a  description  in  terms  of  space  and  time  is  practically 
synonymous  with  a  description  in  terms  of  cause  and  effect.  There  are 
many examples in quantum theory where we have to decide which of two
mutually exclusive forms of description we shall employ.
Bohr  sees  all  these  as  examples  of  the  principle  of  complementarity,  but 
he  regards  this  principle  as  something  of  much  broader  scope  and  of 
deeper  philosophical  significance  than  as  merely  a  principle  limited  in  its 
application  to  physical  systems.  Thus  he  speaks  of  the  impossibility  of 
reconciling  the  demands  of  justice  and  mercy,  and  the  presumptive 
impossibility  of  making  a  physical  analysis  of  biological  systems  suf- 
ficiently searching  to  disclose  the  nature  of  life  without  destroying  that 
life,  as  examples  of  the  general  principle  of  complementarity.  It  seems  to 
me  that  it  did  not  need  quantum  theory  to  disclose  this  general  situation, 
but  that  we  have  always  had  situations  where  we  have  been  forced  to 
shift  to  another  line  of  attack  when  we  push  our  analysis  to  the  logical 
limit.  In  other  words,  the  method  of  "yes-but"  we  have  always  had  with 
us.  It  seems  to  me  that  the  generalized  principle  of  complementarity  is 
merely a glorified version of the principle of "yes-but". The method of
"yes-but" goes back at least to the time of Zeno, who, I will wager, was
as  capable  as  the  next  man  of  catching  the  tortoise  which  he  intended  to 
convert  into  stew  for  dinner,  in  spite  of  his  paradoxes  of  motion.  This 
sort  of  thing  it  seems  to  me  is  too  ubiquitous  and  too  vague  to  warrant 
our seeing here the operation of some grandiose "principle", nor do I
believe  that  it  materially  increases  the  presumptive  truth  of  quantum 
theory  to  have  discovered  this  sort  of  qualitative  situation  concealed 
in  the  consequences  of  its  analysis.  In  fact,  if  it  had  not  found  this  sort 
of  thing  it  would  be  presumptive  evidence  against  it.  These  strictures 
must  not  be  taken  as  in  any  way  reflecting  on  the  validity  of  the  numerical 
relationships  demanded  by  quantum  theory  —  these  are  an  entirely 
different  sort  of  thing. 

Whatever  view  we  take  of  complementarity  as  a  grandiose  principle  of 
sweeping  applicability,  it  seems  obvious  to  me  that  here  we  have  a  factor 
militating  against  sharpness,  for  the  line  separating,  for  example,  a 
legitimate  space-time  description  from  a  deterministic  description  cannot 
be  sharply  drawn.  Whenever  we  encounter  such  a  lack  of  sharpness  we 
may  anticipate  also  a  failure  of  the  possibility  of  rigor. 

All  the  situations  which  we  have  encountered  thus  far  have  a  feature 
in  common.  In  all  of  them  we  have  encountered  failures  of  our  intellectual 
machinery  to  deal  with  experience  as  we  obviously  would  like  to  have  it 
deal  —  in  particular,  our  intellectual  machinery  has  proved  itself  in- 
capable of  exactly  reproducing  what  we  see  happen.  For  instance,  our 
verbalizing,  or  our  mathematics,  which  is  the  same  thing,  has  no  built-in 
cut-off,  corresponding  to  error  or  to  the  finiteness  of  human  experience. 
In  addition  to  this  sort  of  failure  of  our  mental  machinery  to  exactly 
reproduce  features  of  experience  which  are  fairly  obvious  and  which  are 
often  explicitly  talked  about,  I  think  there  is  also  failure  for  reasons  not 
usually  appreciated  or  said  out  loud,  reasons  corresponding  to  demands 
we  ought  to  make  of  our  mental  machinery  but  which  in  fact  we  do  not. 

I  think  it  will  be  admitted  that  an  ideal  mental  machinery  will  not 
employ  the  operation  of  isolation  for  the  reason  that  isolation  does  not 
occur  in  actuality.  Quantum  theory  prohibits  the  isolation  of  the  object  of 
knowledge  from  the  instrument  of  knowledge,  and  successfully  analyzes 
the  situations  to  which  the  Heisenberg  principle  applies  in  terms  of  the 
reaction between instrument and object which is ignored when they are
isolated  from  each  other.  But  any  actual  situation  involves  not  only 
instrument  of  knowledge  and  object  of  knowledge,  but  also  the  knower. 
Quantum  theory,  however,  consistently  neglects  the  knower.  Thus  I  find 
the following quotation in a recent lecture by Professor Bohr: "In every
field of experience we must retain a sharp distinction between the observer
and the contents of the observations." But in the world of things that
happen this sort of distinction does not occur, and in making the distinction
it seems to me that quantum theory practises a kind of isolation. In
physics  the  knower  is  always  there,  whether  I  am  concerned  with  myself 
practising  physics  or  whether  I  observe  other  people  practising  it.  It  may 
well  be  that  quantum  theory  is  justified  for  its  particular  purposes  in 
taking  the  knower  for  granted,  but  we,  in  so  far  as  we  are  committed  to 
the  problem  of  describing  and  understanding  the  total  scene,  may  not 
neglect  the  knower.  The  problem  of  getting  the  knower  into  the  picture 
has  become  acute  now  that  most  of  us  have  become  convinced  that  the 
knower  is  itself  a  physical  system.  Formerly,  when  people  could  think  of 
mental  activity  as  the  functioning  of  a  special  mind  stuff,  sui  generis,  and 
with  little  in  common  with  ordinary  matter,  it  did  not  appear  logically 
absurd  to  hope  to  give  an  account  of  the  one  kind  of  matter  independently 
of  the  properties  of  the  other.  But  now  we  are  convinced  that  mental 
activity occurs in physical structures of stupendous complexity, made of
the  same  atoms  that  the  activity  is  seeking  to  comprehend.  These  com- 
plexities, if  anything,  increase  the  urgency  of  understanding  the  nature  of 
the  coupling  between  the  structure  of  the  brain  and  the  external  world. 
The  presumption  that  there  is  some  sort  of  essential  limitation  because  of 
the  nature  of  the  structure  and  the  coupling  appears  irresistible. 

The  concepts  in  terms  of  which  we  describe  and  understand  the  world 
about us do not occur in nature, but are man-made products. Such things
as  length,  or  mass,  or  momentum,  or  energy  occur  only  in  conjunction  with 
brains.  The  significance  of  these  concepts  cannot  be  isolated  and  as- 
sociated only  with  the  external  world,  but  the  significance  is  a  joint 
significance  involving  external  world  and  brain  together.  Now  it  seems 
to  me  that  it  is  quite  conceivable  that  different  properties  of  the  brain 
structure  are  involved  in  the  concept  of  length,  for  example,  than  are 
involved  in  the  concept  of  mass.  It  might  be  that  the  concept  of  mass  is 
beyond  the  powers  of  certain  simple  types  of  brain  whereas  the  concept  of 
length  might  be  easily  within  them.  If  such  were  the  case,  or  if  our  present 
brain  structure  carries  vestiges  of  limitations  of  this  sort,  our  outlook 
might  be  materially  altered. 

To  completely  answer  the  questions  brought  up  by  considerations  of 
this  sort  not  only  should  we  be  able  to  hold  ourselves  to  an  awareness  of 
the indissoluble tie-in of brain structure with the external world, but we
should  be  able  to  describe  specifically  the  nature  of  this  tie-in  for  different 
concepts  such  as  mass  or  length.  We  are  at  present  hopelessly  far  from 
being  able  to  do  this,  or  even  from  knowing  whether  it  is  possible  "in 
principle".  There  is,  however,  something  which  we  can  now  do  which  has 
the effect of shifting the center of gravity away from the unknown
contribution of the brain, so that a little more "objectivity" can be imparted
to our physical concepts. If one examines what he does when he
determines  such  physical  parameters  of  a  physical  system  as  its  mass  or 
its  energy,  it  will  be  seen  that  the  procedure  involves  the  complicated 
interplay  of  operations  of  manipulation  in  the  laboratory  and  operations 
of  calculation.  It  is  into  these  latter  operations  of  calculation  that  the 
unknown  and  questionable  influence  of  brain  structure  enters.  Suppose 
now  that  we  define  the  energy  of  a  body,  not  as  the  number  which  is  ob- 
tained by  combining  in  a  certain  way  other  numbers  which  may  corre- 
spond to  velocity  and  mass,  but  as  the  number  which  is  automatically 
given  when  a  certain  type  of  instrument,  an  "energy  measurer"  is  coupled 
to  the  body.  Such  an  "energy"  is  more  something  that  we  do  and  less 
something  that  we  say  and  think  than  the  conventionally  defined  energy. 
If  we  are  clever  we  ought  to  be  able  to  design  instruments  which  would 
automatically  record  on  a  scale,  when  coupled  to  the  body,  any  of  the 
conventional  physical  parameters.  When  we  have  designed  such  instru- 
ments we  should  be  able  incidentally  to  discover  some  of  the  limitations 
in the measurability of energy, for instance, whereas it would be hopeless
to  expect  to  find  such  limitations  as  long  as  we  have  to  treat  the  limi- 
tations as  incidental  to  the  structure  of  the  brain. 

I  have  made  the  beginning  of  an  attempt  to  specify  in  detail  how 
instruments  might  be  constructed  which  would  automatically  register  on 
a  dial  this  or  that  physical  parameter  of  an  object  when  coupled  to  it. 
It  is  evident  that  the  instruments  will  fall  into  hierarchies,  the  higher 
members  of  the  hierarchy  employing  as  component  parts  the  complete 
instruments  of  lower  levels.  An  instrument  for  automatically  recording 
length  is  fairly  easy  to  construct,  whereas  I  found  it  to  require  great 
complication  to  construct  an  instrument  for  indicating  mass,  and  even  so 
there  appear  to  be  definite  limitations  on  such  features  as  speed  of  re- 
sponse. This  is  in  spite  of  the  fact  that  it  is  just  as  easy  to  say  mass  as  to 
say  length,  and  that  in  such  an  activity  as  dimensional  analysis  we  think 
of  mass  and  length  as  of  equal  simplicity.  When  we  define  mass  and  length 
instrumentally  in  this  way,  we  see  that,  because  of  its  greater  complexity, 
it  will  not  be  so  easy  to  apply  the  mass  measuring  instrument  to  small 
objects  as  the  length  measuring  instrument,  so  that  the  concept  of  mass  is 
subject  to  limitations  in  the  direction  of  the  very  small  to  which  the  con- 
cept of  length  is  not  subject.  This  sort  of  limitation  is  entirely  different 
from the sort of mutual limitation of measurements of velocity and position,
for example, controlled by the Heisenberg principle in quantum
mechanics.  It  suggests  itself  that  there  may  be  other  sorts  of  conceptual 
limitations  in  making  contact  with  the  world  than  those  treated  by  con- 
ventional quantum  mechanics. 

The  general  problem  back  of  all  these  later  considerations  is  for  the 
knower  to  know  himself.  It  has  been  recognized  as  a  fundamental  philo- 
sophical problem  since  at  least  the  time  of  Socrates,  but  it  appears  that 
we  have  not  got  very  far  toward  a  solution.  Recent  developments  make 
it  appear  that  the  solution  of  this  problem  is  more  difficult  than  was 
perhaps at one time optimistically assumed. For we have here a self-reflexive
situation, a system dealing with itself. Gödel's theorem shows
that  in  the  case  of  at  least  one  special  type  of  such  a  system  there  are 
drastic  and  formerly  unsuspected  limitations.  It  does  not  appear  un- 
reasonable to  suppose  by  analogy  that  there  are  also  formidable  diffi- 
culties in  the  general  case.  I  believe  these  difficulties  appear  the  moment 
one  attempts  a  specific  attack  on  the  problem  —  in  fact  it  is  difficult  to 
even formulate what the problem is in self-consistent language. It seems
to  me  that  nevertheless  the  problem  is  one  of  the  very  first  importance.  I 
think  what  I  have  said  here  makes  it  at  least  doubtful  whether  any 
possible  solution  can  be  rigorous  in  the  canonical  meaning  of  rigor  —  I 
believe  that  this  will  increase  the  difficulty  of  finding  an  acceptable  so- 
lution rather  than  decrease  it,  as  might  perhaps  at  first  seem  natural. 
Until  we  have  solved  the  problem,  I  do  not  believe  that  we  can  estimate 
what the limitations are on any possible rigor, nor even, for that matter,
know  what  the  true  nature  of  rigor  is. 


Symposium  on  the  Axiomatic  Method 


FINITUDE IN CLASSICAL MECHANICS, ITS AXIOMS AND THEIR IMPLICATIONS
(LA FINITUDE EN MÉCANIQUE CLASSIQUE, SES AXIOMES ET LEURS IMPLICATIONS)


ALEXANDRE  FRODA 

Académie de la République Populaire Roumaine, Institut de Mathématiques, Bucarest, Roumanie

The physical world never reveals to us, at least on our scale of
magnitudes, the actual existence of the infinite. In particular, one meets
in classical mechanics neither infinite forces nor an infinity of reversals
of the direction of motion of a material body in a finite lapse of time.
This is what suggested to us the introduction of two new axioms into
mechanics and the study of their implications. We have called them
axioms of finitude (F).

Concerning the negation of the infinite (in classical mechanics) which
inspired these axioms, one may cite one of the profound remarks of E.
Mach on the evolution of mechanics [8, 8; 34]: "One of the characteristics
of instinctive knowledge," he wrote, "is that it is above all negative. What
we are able to do is not to predict what will happen, but only to say which
things cannot happen, for these alone contrast violently with the obscure
mass of experiences, in which the isolated fact is not discerned."

We shall appeal to the new axioms in order to clarify a question which
has forced itself on the attention of physicists ever since Weierstrass
proved the existence of continuous functions without a derivative. Now it
is accepted, in analysis, that nondifferentiable functions are by no means
exceptional in the class of continuous functions. In classical mechanics, on
the contrary, it is assumed that every motion possesses a velocity and an
acceleration at every instant, which implies the existence of derivatives
for all the continuous functions that define the motions analytically.

Let

$$\vec r = \vec r(t) \tag{1}$$

be a vector equation defining the kinematics of the motion $\mu$ of a
material point $M$ of mass $m$, in a lapse of time $\delta = [t_0, t_1]$. It is
supposed there, following Newton, that $t$ measures physically, starting from
an initial instant $t = t_0$, the "absolute" time, and that the position
vector $\vec r$ locates the point $M$ with respect to a fixed system of
Cartesian axes constituting reference marks of "absolute" space.

In classical mechanics one sets, for the velocity and the acceleration of
the moving point at the instant $t$,

$$\vec v(t) = \dot{\vec r}(t), \qquad \vec A(t) = \dot{\vec v}(t), \tag{2}$$

which also defines them as vector functions of $t$.

The continuity of $\vec r(t)$ is assumed there; it results from our
intuition of time and of motion. This assertion will not be called into
question on this occasion.

The definition (2) of $\vec v(t)$ has a meaning in classical mechanics
because the vector function $\vec r(t)$ is supposed there to be
differentiable, a property attributed to every motion. In order to justify
the definition (2) of $\vec A(t)$, it is assumed, moreover, that $\vec v(t)$
is not only continuous but also differentiable with respect to $t$, for every
$t \in \delta$.

Thus every motion $\mu$ would present, in classical mechanics, characteristics
which are demonstrable "neither mathematically nor empirically",
as G. Hamel asserted [5a; 5b, p. 64; 5c, p. 2] in axiomatizing rational
mechanics. Remarking that "no experiment would be fine enough to descend
to the differentials", Hamel attributed the existence of the velocity and of
the acceleration of a motion $\mu$, at every instant $t$ of $\delta$, to a
physical principle according to which "all observable quantities are
continuous and continuously differentiable". A similar principle was asserted
by L. Zoretti [10, pp. 16, 17, 40] in his study of the principles of
classical mechanics.

Let us denote by $R_a$ an axiom asserting the existence of an acceleration
$\vec A(t)$ at every instant $t$ of a motion $\mu$ of the domain $C$ of
validity of classical (Newtonian) mechanics, which also implies the
existence of a continuous velocity $\vec v(t)$.

We shall have to distinguish between motions with a purely kinematic
definition (1) and the motions $\mu_c$ realizable in $C$, which requires
explicit definitions. Setting aside the possible passive resistances
(friction, viscosity, etc.) of a real motion $\mu$ of $C$, one makes
correspond to $\mu$ a "conservative"¹ motion $\mu_c$, which is either equal
to $\mu$ or, when $\mu$ is not conservative, equal to the (kinematic) limit
of the sequence of non-conservative motions obtained by making the passive
resistances of $\mu$ tend successively to zero. This is considered possible,
if not experimentally, at least theoretically.

Let (R) be a classical system of axioms of rational mechanics.
Every motion realizable in $C$ satisfies it, but the converse might not be
true. Indeed, when an arbitrary motion of a material point $M$ of mass $m$
is given by equation (1), it seems doubtful that such a motion is
realizable whatever its kinematic definition may be. This fact was already
pointed out by H. Hertz [7, p. 12] in his study of the principles of
mechanics. We shall have to return presently to this point, which is as
important as it is delicate.

The domain $C$ of the motions considered in rational mechanics in
Newton's time was subsequently reduced as a result of the criticism of the
principles which he had laid at the basis of his "natural philosophy", a
criticism stimulated by the later progress of physics. Newton had admitted
at the same time the following principles: 1) the existence of an "absolute"
time and an "absolute" space, as well as the "absolute" constancy of mass
in motion, and 2) the persistence of the properties of matter down to its
ultimate, "indivisible" parts. Of these principles, the first is today
contested by the mechanics of general relativity, the second by quantum
(wave) mechanics. Consequently, the domain $C$ of rational mechanics is
limited today by the existence of these latter mechanics. That is why our
axioms of finitude will apply only to rational mechanics, without prejudice
to their possible extension to the new mechanics. Now, from the mechanics
of the point one is led to the mechanics of systems by virtue of axioms
which we are not going to examine.

Let us note, however, that the notion of material point itself raises
questions. Since Euler and Lagrange one often admits, in classical
mechanics, the existence of material points $M$ of zero dimensions but of
nonzero mass $m$. L. Zoretti [10] calls them "fictitious", since the
rotational properties of a very small body are neglected. But there is
more: the introduction of such points can lead to contradictions. Consider,
for example, setting aside passive resistances, the motion of $M$ along a
material, plane, vertical curve $\Gamma$ of equation y = x*1*, where $Oy$ is
the vertical. If $M$ is to be pressed against $\Gamma$ on its concave side,
$M$ must be a geometric point, since the radius of curvature $\rho$ equals
zero at the vertex of $\Gamma$. Now if the velocity $\vec v$ were not zero
there, the force of constraint would be infinite there, which is not
physically realizable.

¹ A motion will be called conservative, by definition, if it involves no
degradation of energy (due to passive resistances). For the different,
classical meaning attributed to the expression "conservative system", see
for example Appell P., Traité de mécanique rationnelle, vol. II, 4th ed.,
Paris (1923), p. 65 ff.
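
The divergence of the force of constraint can be made explicit under an assumption about the curve. The sketch below takes, purely for illustration, a curve of the form $y = |x|^p$ with $1 < p < 2$. The exponent $p$, the slope angle $\theta$ of the tangent and the gravitational acceleration $g$ are notations introduced here, not taken from the text, and friction is neglected.

```latex
% Assumed curve: y = |x|^p with 1 < p < 2, vertex at the origin, Oy vertical.
\[
  y' = p\,|x|^{p-1}\operatorname{sgn}x, \qquad
  y'' = p\,(p-1)\,|x|^{p-2} \quad (x \neq 0),
\]
\[
  \rho(x) = \frac{\bigl(1 + y'^2\bigr)^{3/2}}{|y''|}
          = \frac{\bigl(1 + p^2 |x|^{2p-2}\bigr)^{3/2}}{p\,(p-1)}\;|x|^{2-p}
          \;\longrightarrow\; 0 \qquad (x \to 0).
\]
% Normal equation of motion for a point of mass m pressed against the
% concave side of the curve and moving with speed v:
\[
  N = \frac{m v^2}{\rho} + m g \cos\theta .
\]
% Since rho -> 0 at the vertex, N grows without bound there unless v -> 0.
```

Any exponent strictly between 1 and 2 gives the same qualitative result: a finite slope at the vertex together with a vanishing radius of curvature.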

One may remark, moreover, that the existence of a material point
without dimensions is just as open to criticism as the admission of the
existence of a real instant $t$ of time, of zero duration, which can
likewise lead to contradictions. These physically inconceivable existences
are sometimes implied by the application of the infinitesimal method in
physics, which ought, it seems, to be made the object of an axiomatic
analysis, one rather difficult to carry out.

Such objections also appear in the new mechanics. Let us note, in
passing, a significant passage in which Heisenberg, dealing with his
uncertainty principle, expressed, as early as 1930, doubts of principle about
the legitimacy of attributing a physical meaning to the passage to the limit
of an elementary volume and of an elementary duration, when it is a
question of evaluating the amplitude of an electric field and of a magnetic
field in quantum mechanics [6, p. 37].

Let us return to our problem, which is that of ridding the axioms of
classical mechanics of the hypothesis $R_a$ defined above. This will be
achieved by first considering, under the most general conditions of
analysis, the vector quantities which occur in mechanics, and by then
seeking the circumstances which impose the existence of derivatives when
the motions are realizable in $C$. This is what we undertook in an earlier
work, in Romanian [3, pp. 3-4].

The development of the program indicated requires notions of general
kinematics, which are defined by extending to vector functions
$\vec W(t)$ of a real variable $t$ the classical properties of real
(numerical) functions of $t$ concerning continuity, bounds, limits,
differentiation and integration. It suffices for us to mention the analogy,
which is reflected in the respective terminology.

Here, finally, are some definitions of general kinematics, which will
serve us in formulating the axioms of rational mechanics. Consider again
a motion $\mu$ with kinematic definition (1). By definition, the velocity
$\vec v(t)$ exists at the instant $t$ if, for $\Delta t \to 0$, it makes
sense to write

$$\vec v(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,\bigl[\vec r(t + \Delta t) - \vec r(t)\bigr] \tag{3}$$

and this velocity will be called complete. Likewise, there exists at the
instant $t$ a prospective velocity $\vec v_+(t)$ or a retrospective velocity
$\vec v_-(t)$ when the limit in (3) makes sense for $\Delta t > 0$ or for
$\Delta t < 0$, respectively. Consider, in particular, the case of a motion
$\mu$ such that $\vec v(t)$ exists and is continuous at each instant $t$ of
$\delta = [t_0, t_1]$. By definition, the acceleration $\vec A(t)$ then exists
at the instant $t$ if, for $\Delta t \to 0$, it makes sense to write

$$\vec A(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,\bigl[\vec v(t + \Delta t) - \vec v(t)\bigr] \tag{4}$$

and this acceleration will be called complete. Likewise, there exists at the
instant $t$ a prospective acceleration $\vec a(t)$ or a retrospective
acceleration $\vec \alpha(t)$ when the limit in (4) makes sense for
$\Delta t > 0$ or for $\Delta t < 0$, respectively.
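
The distinction between complete, prospective and retrospective limits can be illustrated numerically. The sketch below uses the one-dimensional function $r(t) = |t|$, chosen here only as an example and not taken from the text, and evaluates the difference quotient of (3) at $t = 0$ for positive and for negative increments. The function names are hypothetical.

```python
from typing import Callable, Tuple

def one_sided_quotients(r: Callable[[float], float], t: float, h: float) -> Tuple[float, float]:
    """Return the difference quotients of r at t for the increments +h and -h."""
    forward = (r(t + h) - r(t)) / h        # increment dt = +h > 0 (prospective side)
    backward = (r(t - h) - r(t)) / (-h)    # increment dt = -h < 0 (retrospective side)
    return forward, backward

def r(t: float) -> float:
    # Illustrative motion along one axis with a corner at t = 0.
    return abs(t)

for h in (1e-1, 1e-3, 1e-6):
    print(h, one_sided_quotients(r, 0.0, h))
```

For this example the forward quotients stay at +1 and the backward quotients at -1 however small the increment, so a prospective and a retrospective velocity both exist at $t = 0$ while the complete velocity does not.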

We shall see that in the domain $C$ of rational mechanics the complete
velocity $\vec v(t)$ exists at every instant of a motion realizable in $C$,
and that one then has

$$\vec v(t) = \vec v_+(t) = \vec v_-(t).$$

That is why it is useless to define "accelerations" starting from the
prospective velocity $\vec v_+(t)$ and the retrospective velocity
$\vec v_-(t)$.

To the preceding kinematic notions let us add the following definitions,
in order to shorten the language:

1) A motion $\mu$ with kinematic definition (1) will be said to be
regular in a lapse of time $\delta = [t_0, t_1]$ if it possesses at each
instant $t$ of $\delta$ a continuous complete acceleration $\vec A(t)$.

2) Considering the motions $\mu$ of equation $\vec r = \vec r(t)$ and
$\mu_n$ of equations $\vec r = \vec r_n(t)$, $n = 1, 2, \ldots$, one will say
that $\mu$ is the kinematic limit of the $\mu_n$, for $n \to \infty$, when
$\vec r(t) = \lim_{n \to \infty} \vec r_n(t)$.

3) A vector $\vec W(t)$ will be said to change orientation in space an
infinity of times in an indefinite (increasing, resp. decreasing) sequence
$\sigma$ of successive instants $t_i$ when there exists an axis of space
such that the projections of $\vec W(t_i)$, $\vec W(t_{i+1})$ on this axis
have opposite senses for an infinity of indices among the $i = 1, 2, \ldots$.

4) When, in an interval $\delta = [t_0, t_1]$, motions $\mu_1$, $\mu_2$ are
given by $\vec r = \vec r_1(t)$, $\vec r = \vec r_2(t)$, one will say that the
motion $\mu$ given by $\vec r = \vec r(t)$ is their resultant motion if
$\vec r(t) = \vec r_1(t) + \vec r_2(t)$.

5) Let us call polynomial a motion $\vec r = \vec r(t)$ in which
$\vec r(t)$ is a polynomial in $t$ whose coefficients are constant vectors
of space. Such a motion is regular.

Let us add the following remark: since continuous vector functions of
$t$ are (uniform) limits of polynomials, every motion $\mu$ with kinematic
definition (1) of a point $M$ can be expressed (and in many different ways)
as the kinematic limit of a sequence of polynomial motions $\mu_n$ of the
same point.
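
The remark rests on the Weierstrass approximation theorem, applied to each component of the vector function. As a hedged illustration, the sketch below uses Bernstein polynomials, one standard constructive route to that theorem and not necessarily the one the author has in mind, to approximate a single component $x(t) = |t - 0.5|$ on $[0, 1]$. The component, the degrees and the function names are all chosen here for the example.

```python
import math

def bernstein(f, n, t):
    """Value at t in [0, 1] of the degree-n Bernstein polynomial of f."""
    return sum(
        f(k / n) * math.comb(n, k) * t**k * (1 - t) ** (n - k)
        for k in range(n + 1)
    )

def x(t):
    # One Cartesian component of a continuous motion with a corner at t = 0.5
    # (no velocity exists at that instant).
    return abs(t - 0.5)

def max_error(n, samples=201):
    """Largest deviation between x and its degree-n approximation on a grid."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(bernstein(x, n, t) - x(t)) for t in grid)

# Each entry corresponds to a polynomial (hence regular) motion mu_n.
errors = [max_error(n) for n in (4, 16, 64, 256)]
print(errors)
```

The maximal deviation shrinks as the degree grows, although every approximating motion is regular while the limit motion has no velocity at $t = 0.5$. A kinematic limit of regular motions therefore need not be regular.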

Let us now return to the classical system (R) of the axioms of
mechanics. It is admitted, in rational mechanics, that the force $\vec F$
which produces the motion $\mu$ of a material point $M$ must exist at every
instant $t$ of $\delta$ (even if $\vec F = 0$) and that $\mu$ is subject, at
that instant, to Newton's law (axiom $R_n$), which is written

$$\vec F = m\,\vec\Gamma, \tag{5}$$

where $m$ is the constant mass of $M$ and $\vec\Gamma$ its acceleration at
the instant $t$.

E. Mach and P. Painlevé saw in $R_n$ only a definition of the force
$\vec F$ which produces the motion, but G. Hamel recognized in it an
effective relation, for there exist classes $(\Phi)$ of physical phenomena
such that, for each of them, when $m$ denotes the constant mass of the
material point $M$, there is a general law

$$\vec F = m\,\vec\Phi(\vec r, \vec v, t),$$

where $\vec\Phi$ is a vector function of the variables $\vec r$, $\vec v$,
$t$ attached to the class $(\Phi)$, and very often constituting there the
vector of a field. If it is a question of a determinate motion $\mu$,
referred by (1) and such that $\vec v(t)$ exists in $\delta$, the formulas
(3), (4) show that, during the motion, $\vec F = \vec F(t)$, so that the
force which produces $\mu$ is, in this case, equal to a vector function of
$t$ and of $t$ alone.

The existence of such a force, producing a determined motion realizable in C, is ensured by an axiom Rd. One can prove that the acceleration $\vec\Gamma$, whose existence is assumed in (5), enters there only in the form of the prospective acceleration $\vec a(t)$:

1) Here is a first argument: if one had $\vec\Gamma = \vec A$ for a motion $\mu$ realizable in C and at every instant $t$ of $\delta$, the discontinuities of $\vec F(t)$ would, by virtue of (5), have to have the character of those of a vector derivative $\vec A(t)$, and so could not be of the first kind (i.e. exhibit a jump). But this is contradicted by clear indications of physical experience, as the following examples show:

a. Consider the discontinuous force which sets in motion the balanced weight of Atwood's machine, at the instant when the additional mass is stopped. This force makes a jump.

b. Consider the discontinuous force which produces the motion of a point subject to the Newtonian attraction of a closed spherical surface, at the instant when it would cross that surface. This force makes a jump.

2) Here is a second argument: when the action of a force ceases to be exerted ($\vec F = 0$) on a material point $M$, the point continues to move with a uniform rectilinear motion by virtue of its acquired velocity. Now if one had in (5) the equality $\vec\Gamma = \vec A(t)$ at each instant $t$ of $\delta$, the force $\vec F$ would be predetermined at the instant $t_0$ by the motion prior to $t_0$, since the acceleration $\vec A(t_0)$, being then equal to the retrospective acceleration at $t_0$, is given by the values of $\vec r(t)$ for $t < t_0$. This is not only paradoxical but becomes plainly absurd when $\vec F(t)$ is discontinuous at $t = t_0$; moreover, it contradicts the conception of a force as the cause of modification of inertial motion, whose existence is ensured by an axiom Ri.

Let us also cite two axioms of (R) which complement each other: axiom Re asserts that the (kinematic) resultant motion of motions realizable in C is also realizable in C, and axiom Rf asserts that under the simultaneous action of several forces it is the force equal to their vector resultant which replaces them.

A delicate question, already mentioned in passing, is the following: is the system (R) of classical axioms, Ra included, also a sufficient condition for every kinematically defined motion to be realizable in C? This is not at all likely, and it is even doubtful that there can exist a system of axioms representing necessary and sufficient conditions for every motion $\mu$ satisfying them to be realizable in C.

We can nevertheless state sufficient conditions for certain motions to be realizable in C. Here they are, in the form of propositions which can be proved without appeal to the axioms of rational mechanics, but by a constructive method:

I. Every polynomial motion of a material point $M$ is realizable in C.

II. Every motion $\mu$ is realizable in C at the same time as its projections $\mu_x$, $\mu_y$, $\mu_z$ on the axes.

III. When a motion $\mu$ of a material point $M$, kinematically defined in a lapse of time $\delta = [t_0, t_1]$, possesses a prospective acceleration $\vec a(t)$ which is continuous and does not change its orientation in space infinitely many times, the motion $\mu$ is realizable in C.

The proofs of propositions I, II and III consist in exhibiting the possibility of constructing the motion $\mu$ in question by means of suitable mechanisms, passive resistances being disregarded. These constructions play the role of existential models.

We can also state conditions necessary for a motion to be realizable in C, by completing the system (R) with the finiteness axioms (F) on the one hand, while on the other hand abandoning axiom Ra, which asserted a priori the existence of the acceleration at every instant of a motion realizable in C.

Here, finally, is the statement of our finiteness axioms, valid in rational mechanics.

(F). When a motion $\mu$ is realizable in C in a finite lapse of time $\delta$, it satisfies the conditions:

F1. Among the sequences of regular, realizable motions having the motion $\mu$ as kinematic limit, there exists a sequence of motions $\mu_n$, $n = 1, 2, \ldots$, such that the forces $\vec F_n$ which produce them are uniformly bounded.

F2. The force $\vec F(t)$ which produces the motion $\mu$ in $\delta$ cannot change orientation infinitely many times in any indefinite (increasing, resp. decreasing) sequence $\sigma$ of successive instants.

It should be noted that the adjunction of the finiteness axioms (F) to a classical system (R) of axioms of motion in C must be carried out together with the abandonment of axiom Ra. But one can give up the hypothesis of existence of the acceleration only by modifying at the same time the wording of other axioms of (R), so as no longer to admit explicitly the existence of $\vec v(t)$ and $\vec A(t)$ for every $t \in \delta$. The new system of axioms thus obtained will replace (R), and we shall set out its principal implications, in which the role of the axioms (F) is essential.

To obtain them, one relies on general properties of vector functions of $t$, as well as on propositions I, II, III above, which express sufficient conditions for certain motions to be realizable in C. One obtains the following results, which express properties belonging to every motion $\mu$ realizable in C:

1°. At each instant $t$ of $\delta$ there exist a continuous complete velocity $\vec v(t)$, a prospective acceleration $\vec a(t)$ and a retrospective acceleration.

2°. The prospective acceleration $\vec a(t)$ is prospectively continuous and has only a finite number of instants of discontinuity in a finite lapse $\delta$.

3°. The complete acceleration $\vec A(t)$ exists and is continuous at each instant $t$ of $\delta$, except at a finite number (possibly zero) of instants $t_k$ of $\delta$, at which the prospective and retrospective accelerations are discontinuous and differ from each other.

4°. Every $\mu$ is, in $\delta$, either a regular motion or a finite succession of regular motions.
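Properties 1° to 4° can be made concrete on a motion composed of two regular pieces. The sketch below assumes that the prospective (resp. retrospective) acceleration is the right-hand (resp. left-hand) derivative of the velocity, as the analogy with one-sided derivates suggests; the motion and the function names are hypothetical.

```python
def velocity(t):
    """Velocity of a motion made of two regular pieces joined at t = 1.

    The position is r(t) = t for t <= 1 and r(t) = t + (t - 1)**2 for t > 1,
    so the velocity is continuous while the acceleration jumps from 0 to 2.
    """
    return 1.0 if t <= 1.0 else 1.0 + 2.0 * (t - 1.0)


def prospective_acceleration(v, t, h=1e-6):
    """Right-hand difference quotient of the velocity (looks forward in time)."""
    return (v(t + h) - v(t)) / h


def retrospective_acceleration(v, t, h=1e-6):
    """Left-hand difference quotient of the velocity (looks backward in time)."""
    return (v(t) - v(t - h)) / h


# At the junction the two one-sided accelerations differ, so no complete
# acceleration exists there; elsewhere they agree.
a_plus = prospective_acceleration(velocity, 1.0)
a_minus = retrospective_acceleration(velocity, 1.0)
```

In this example there is exactly one instant at which the complete acceleration fails to exist, in line with the finite number of exceptional instants allowed by 3°.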

The proof of the preceding properties uses the mathematical apparatus of the theory of vector functions of real variables [1]. Apart from some known propositions, or propositions which extend directly to vector functions classical properties of numerical functions of a real variable, we have very often appealed to a proposition inspired by these very investigations, which is as follows:

"When, among the prospective (resp. retrospective) derived vectors at $t = t_0$ of a vector function $\vec V(t)$ whose derived vectors are uniformly bounded in $\delta = [t_1, t_2]$, $t_0 \in \delta$, there are two derived vectors $\vec D_1$, $\vec D_2$ making a non-zero angle with each other, there exists a variable vector $\vec W(\tau_p)$, equal to the unique vector derivative of $\vec V(t)$ at $t = \tau_p$, which changes its orientation in space infinitely many times along a sequence of values $\tau_p$, $p = 1, 2, \ldots$, tending to $t_0$ decreasingly (resp. increasingly)."

I add that, by definition, a "prospective derived vector" of the variable vector $\vec V(t)$ at $t = t_0$ corresponds, by analogy, to one of the right-hand derivates of a real function $f(t)$, while the "vector derivative" corresponds to the unique derivative at $t = t_0$.

The preceding results 1° to 4° appeal directly to the axioms denoted above by Rd, Ri, Rn, Re, Rf, F1, F2 and do not use axiom Ra. The other axioms, Rr on the equality of action and reaction, and Ru, which ensures the uniqueness of a motion realizable in C for given initial conditions under the action of given forces, do not enter, at least not explicitly. Now we have seen that certain axioms of (R) must be expressed in a modified form before forming the new system together with the axioms (F). Here are examples which show the modifications in question:

1°. Axiom Ri (law of inertia): "When, at an initial instant $t_0$ of the lapse $\delta$, a material point $M$ possesses a retrospective velocity $\vec v_-(t_0)$ and no force is exerted on it during $\delta$, the point $M$ describes in $\delta$ a rectilinear motion with velocity $\vec v(t)$ constantly equal to $\vec v_-(t_0)$."

By using the initial retrospective velocity one avoids reintroducing (even in a weakened form) the hypothesis of existence of the velocity; it suffices indeed to admit physically that the motion of $M$ is available in a lapse of time, as small as one pleases, prior to $t_0$.

2°. Axiom Rn (Newton's law): "If, in a motion $\mu$ realizable in C of a material point $M$ possessing at each instant $t \in \delta$ a continuous velocity $\vec v(t)$, there exists at a certain instant $t_1 \in \delta$ a prospective acceleration $\vec a$, and if the force exerted on $M$ is $\vec F$, then

$$\vec F = m\,\vec a,$$

where $m$ is a constant depending on $M$ and independent of the motion, i.e. of $t$."

It is clear that the existence of the prospective acceleration $\vec a = \vec a(t_1)$ is assumed in this statement only for the instant $t = t_1$.

The finiteness axioms (F) do not claim to be accepted, like axiom Ra, without confrontation with experience. One can conceive experiments whose foreseeable results constitute a verification of these axioms. Here is the scheme of one such experiment, taking place without passive resistances and using perfectly elastic solids:

Let P be a small vertical simple pendulum whose oscillations are limited on each side by vertical plane obstacles, symmetric with respect to the vertical plane V containing the axis of suspension. These obstacles are connected to the ground so as to possess uniform, autonomous motions, independent of the impacts of the small pendulum and such that they arrive simultaneously in V at the instant $t_1$. According to the classical laws of impact the pendulum should perform infinitely many half-oscillations, of decreasing amplitudes, in a lapse $\delta = [t_0, t_1]$, which would contradict F2. In reality, however, the number of oscillations can only be finite, as is easily verified if one takes into account the duration of the impacts, calculable according to Hertz's law. That is why axiom F2 will be verified by this experiment. Neglecting the duration of the impacts engenders paradoxes, such as the one noticed by D. Gale [4], who thought he had pointed out a case of indeterminacy in classical mechanics.
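The accumulation of impacts predicted by the classical laws can be seen in a simplified one-dimensional analogue, sketched below. This is not the pendulum of the text: it assumes a free point bouncing between two plane walls that close in at uniform speed, with instantaneous, perfectly elastic impacts against walls of effectively infinite mass. All names and numerical values are hypothetical.

```python
def count_bounces(half_gap, wall_speed, speed, min_gap):
    """Count impacts before the half-gap between the walls drops below min_gap.

    Two walls start at -half_gap and +half_gap and close in at wall_speed;
    a point starts at 0 moving right at `speed`. Impacts take no time.
    Returns (number of impacts, time of the last event examined).
    """
    t, x, direction, bounces = 0.0, 0.0, 1.0, 0
    while True:
        # Time until the point meets the wall it is heading toward.
        gap_now = half_gap - wall_speed * t
        wait = (gap_now - direction * x) / (speed + wall_speed)
        t += wait
        gap = half_gap - wall_speed * t
        if gap < min_gap:
            return bounces, t
        x = direction * gap          # the point sits on the wall at impact
        speed += 2.0 * wall_speed    # elastic rebound off an approaching wall
        direction = -direction
        bounces += 1


meeting_time = 1.0 / 1.0  # half_gap / wall_speed: instant at which the walls meet
coarse, t_coarse = count_bounces(1.0, 1.0, 1.0, min_gap=1e-2)
fine, t_fine = count_bounces(1.0, 1.0, 1.0, min_gap=1e-4)
```

Lowering `min_gap` raises the impact count without bound while every impact still occurs before `meeting_time`, which is the idealized behaviour that the text rejects once the finite duration of each impact is taken into account.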

One can also establish by reasoning the independence of the axioms (F) from the classical system $\{(R) - (Ra)\}$ of axioms.

The question raised by the possible extension of the finiteness axioms, valid in C, to the phenomena studied by the new mechanics is an open problem.

Some well-known elementary facts bear on this problem and can be connected with the axioms (F) and their implications in C:

1) In the mechanics of general relativity, where mass is a function of the velocity of the motion, there is the velocity $c$ of light, which sets a finite bound on the velocity of every motion; this must affect the wording of axiom F1.

2) In quantum mechanics, one recalls that the existence of a velocity has been called into doubt ever since the first studies of Brownian motion. Thus J. Perrin, in his experiments, pointed out a similarity in appearance between Brownian motion and functions without derivatives [9, p. 164], which confirmed the views of Einstein, who in his theoretical studies had previously shown that the mean velocity over $\Delta t$ of the motion of a particle tends to no limit as the duration $\Delta t$ tends to zero. He concluded by remarking that to the observer of this motion the mean velocity would appear as an instantaneous velocity, but that in fact it represents no objective property of the motion under investigation, at least if the theory corresponds to the facts, he added [2]. With these words of extreme caution I too conclude my exposition.


Bibliography

[1] BOURBAKI, N., Fonctions d'une variable réelle. Livre IV, Chap. I, II, III, Paris 1949.
[2] EINSTEIN, A., Zur Theorie der Brownschen Bewegung. Annalen der Physik, Serie 4, Vol. 19 (1906), pp. 371-381.
[3] FRODA, A., Sur les fondements de la mécanique des mouvements réalisables du point matériel (in Romanian). Studii şi Cercetări Matematice, t. III, Bucureşti 1952.
[4] GALE, D., An indeterminate problem in classical mechanics. Amer. Math. Monthly, vol. 59 (1952), pp. 291-295.
[5] HAMEL, G., a) Ueber die Grundlagen der Mechanik. Math. Annalen, Bd. 66 (1908), pp. 350-397.
    b) Elementare Mechanik. Leipzig, 1912.
    c) Die Axiome der Mechanik. Handbuch der Physik, Bd. 5, Berlin 1927, pp. 1-42.
[6] HEISENBERG, W., Die Physikalischen Prinzipien der Quanten-theorie. Leipzig 1941.
[7] HERTZ, H., Die Prinzipien der Mechanik in neuem Zusammenhange dargestellt, Gesammelte Werke. Bd. III, Leipzig 1910.
[8] MACH, E., La Mécanique, exposé historique et critique de son développement (trad. Em. Bertrand). Paris 1904.
[9] PERRIN, J., L'Atome. Paris 1912.
[10] ZORETTI, L., Les principes de la mécanique classique. Mémorial des Sciences Math., Paris 1928.


Symposium  on  the  Axiomatic  Method 


THE  FOUNDATIONS  OF  RIGID  BODY  MECHANICS  AND 

THE  DERIVATION  OF  ITS  LAWS  FROM  THOSE  OF 

PARTICLE  MECHANICS 

ERNEST  W.  ADAMS 

University  of  California,  Berkeley,  California,  U.S.A. 

1. Introduction. This paper has three purposes: (1) to give a system of axioms for classical rigid body mechanics (henceforth abbreviated 'RBM'); (2) to show how these axioms can be derived from those of particle mechanics (abbr. 'PM'); and (3), using the foregoing derivation as an example, to give a general characterization of the notion of 'reduction' of theories in the natural sciences. The axioms to be given are due jointly to Herman Rubin and the author. They comprise what may be thought of as the theory of rigid motions under finite applied forces with moments of inertia given. That part of RBM which deals with the calculation of moments of inertia from known mass distributions is omitted, since the laws of motion can be stated directly in terms of total masses and moments of inertia. Similarly, the theory of impacts, which cannot be represented in terms of finite forces, is excluded. In the axioms, the laws of rigid motion are presented de novo, and are not, as is usually the case, presented as deductive consequences of the laws of PM. It is our contention that, in spite of superficial differences from the better-known examples, the derivation of the laws of RBM from those of PM can be viewed as an example of reduction. In section 3 we shall analyse the logical relation which must hold between two theories in order that one should be reduced to the other, and then in the final section we shall indicate how RBM may be reduced to PM in accordance with the theory of reduction previously given.

Because of limitations of space, our discussion both of the theories of RBM and PM and of the general concept of reduction and its specific application to RBM and PM will be limited. A complete formal development of these topics is given in the author's Ph.D. dissertation, The Foundations of Rigid Body Mechanics [1].

2. Axioms of Rigid Body Mechanics. Our axioms are based on seven primitive notions, five of which are closely analogous to the primitive notions of McKinsey, Sugar, and Suppes' axiomatization of classical PM [5]. These seven are denoted 'K', 'T', 'g', 'R', 'H', '$\mu$', and '$\theta$', and their intended interpretations are as follows:

K  is  a  set  of  rigid  bodies. 

T  is  an  interval  of  real  numbers  representing  clock  readings  during  an 
interval  of  time. 

g  is  a  function  from  K  to  the  positive  real  numbers,  such  that  for  every 
rigid  body  k  in  K,  g(k)  is  the  mass  of  k  as  measured  in  some  fixed  units. 

R is a function from K × T to $E_r$ (the set of ordered r-tuples of real numbers) such that for each k in K and t in T, R(k, t) is the r-vector representing the position of the center of mass of k at the instant when the clock reads t, as measured relative to a system of cartesian coordinate axes. r-vectors are here construed to be ordered r-tuples of real numbers, and, of course, in the ordinary application, r = 3.

H is a function from K × T × N (N being the set of positive integers) to $E_r \times E_r$, such that H(k, t, n) represents the n'th applied force acting on body k at the time t in the following way: $H^1(k, t, n)$ is the r-vector representing the magnitude and direction of the n'th applied force, and $H^2(k, t, n)$ is the r-vector representing the position of the point of application of this force relative to a specially selected system of coordinate axes which are parallel to the original reference frame, but which have their origin at the center of mass of k. We shall call the original axes the 'axes of the space,' and the new axes the 'non-rotating axes of k'.

$\mu$ is a function from K to the set of r by r matrices with real components, such that for each k in K, $\mu(k)$ is a matrix representing the moment of inertia tensor of k relative to still another set of coordinate axes, which we shall call the 'rotating axes of k.' The rotating axes of k are a system of cartesian coordinate axes which have, like the non-rotating axes of k, their origin at the center of mass of k, but which rotate with k so that they always maintain a fixed relation to the parts of k. If k is composed of a finite number of mass points with masses $m_1, \ldots, m_l$ and positions $L_1, \ldots, L_l$ relative to the rotating axes of k, then the matrix $\mu(k)$ is the sum of the products:

$$\mu(k) = \sum_{i=1}^{l} m_i\, L_i^{*} L_i. \qquad {}^{1}$$


1 The transpose L* of an r-vector L is a 'column vector' with r rows, and the dyadic product of a column vector L* and a row vector, say M (both r-vectors), is



The matrix $\mu(k)$ defined as above is symmetric and positive semi-definite since all of the $m_i$'s are positive. It is to be particularly noted that moment of inertia, as characterized here, is independent of time, because it is defined relative to the rotating axes of k, which always remain fixed within k. To transform to the time-dependent moment of inertia function used in many formulations of the laws of RBM (e.g., Milne [7], p. 267 or Joos [3], p. 137 or McConnell [4], p. 233), it is necessary to introduce our last primitive notion, a function representing the orientation in space of the rotating coordinate axes of k.

$\theta$ is a function from K × T to the set of r by r orthogonal matrices, such that for each k in K and t in T, $\theta(k, t)$ represents the 'orientation' of the set of rotating coordinates of k at time t relative to the axes of the space. $\theta(k, t)$ gives the orientation of the moving axes in the sense that for each j = 1, ..., r, $\theta(k, t)_j$ (the j'th row of the matrix $\theta(k, t)$) is the unit vector in the direction of the j'th axis of the moving axes of k at time t. Or, $\theta(k, t)_{ij}$ is the cosine of the angle between the i'th rotating axis of k at time t and the j'th axis of the space.

The equation which relates the time-dependent moment of inertia function $\mu(k, t)$ and the time-independent moment of inertia function $\mu$ is simply:

$$\mu(k, t) = \theta^{*}(k, t)\,\mu(k)\,\theta(k, t).$$
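The relation between the two moment of inertia functions can be checked numerically. The following sketch assumes that the moment of inertia matrix of a finite body is the mass-weighted sum of dyadic products of the position row vectors, and that the rows of the orientation matrix are the rotating axes expressed in space coordinates; the body, the masses, and the function names are hypothetical.

```python
import numpy as np


def inertia_matrix(masses, positions):
    """Sum of m_i * (L_i^T L_i) over the mass points, positions given as rows."""
    positions = np.asarray(positions, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (positions.T * masses) @ positions


def rotation_about_z(angle):
    """Orthogonal matrix whose rows are the rotating axes in space coordinates."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])


# A planar body of three mass points with its center of mass at the origin,
# positions taken relative to the rotating axes.
masses = [1.0, 1.0, 2.0]
body_positions = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [-1.0, 0.0, 0.0]]

mu_body = inertia_matrix(masses, body_positions)
theta = rotation_about_z(0.3)

# Positions relative to the non-rotating axes: each row L becomes L @ theta.
space_positions = np.asarray(body_positions) @ theta
mu_space_direct = inertia_matrix(masses, space_positions)
mu_space_formula = theta.T @ mu_body @ theta
```

Computing the matrix directly from the positions in space and computing it through the orientation matrix give the same result, and the rank of `mu_body` equals the dimension of the planar body.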

The  axioms  for  RBM  can  now  be  stated  in  terms  of  the  seven  primitive 
concepts  just  discussed.  The  style  in  which  these  axioms  are  formulated  is 
very  similar  to  that  of  the  axioms  for  classical  particle  mechanics  due  to 
McKinsey,  Sugar,  and  Suppes  [5],  and  the  axioms  for  relativistic  particle 
mechanics  due  to  Rubin  and  Suppes  [10]  (see  also,  McKinsey  and  Suppes 
[6]).  That  is,  the  axioms  are  conditions  which  are  parts  of  the  definition  of 
the  set-theoretical  predicate  system  of  r-dimensional  rigid  body  mechanics. 
Our  axioms  rely  directly  on  the  concept  of  a  system  of  r-dimensional 
particle  mechanics,  which  is  defined  as  follows : 

DEFINITION 1. An ordered quintuple ⟨P, T, m, S, F⟩ is a SYSTEM OF CLASSICAL r-DIMENSIONAL PARTICLE MECHANICS if and only if it satisfies axioms P1-P6.

P1. P is a non-empty finite set.
P2.   T  is  an  interval  of  real  numbers. 

an r by r matrix (L)*(M) such that the element of its i'th row and j'th column is $L_i M_j$.



P3. S is an r-vector valued function with domain P × T such that for all p in P and t in T, $\frac{d^2}{dt^2}(S(p, t))$ exists.
P4. m is a positive real-valued function with domain P.
P5. F is an r-vector valued function with domain P × T × N, where N is the set of positive integers, and for all p in P and t in T the series $\sum_{i=1}^{\infty} F(p, t, i)$ is absolutely convergent.
P6. For all p in P and t in T,

$$m(p)\,\frac{d^2}{dt^2}(S(p, t)) = \sum_{i=1}^{\infty} F(p, t, i).$$

In the above axioms, P is to be thought of intuitively as a set of particles, T, again, is an interval of clock readings, m(p) is the mass of particle p, S(p, t) is the r-vector representing the position of p at time t, relative to a system of cartesian coordinate axes, and F(p, t, i) is an r-vector representing the magnitude and direction of the i'th force applied to p at time t (in the case of particle mechanics it is not necessary to take into account the point of application of a force since this affects only the rotation, and not the translation of a particle). The only axiom embodying what is normally thought of as a 'physical law' is P6, expressing a version of Newton's Second Law. The first five axioms serve only to define the set-theoretical character of the primitive notions, and state certain continuity and differentiability conditions.
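A one-particle model of these axioms can be checked numerically. The sketch below assumes the usual reading of P6, mass times second derivative of position equals the sum of the applied forces, and uses a hypothetical infinite family of forces that splits a constant push geometrically so that the series is absolutely convergent; all names and values are illustrative.

```python
import numpy as np

MASS = 2.0
PUSH = np.array([0.0, 0.0, -9.0])  # hypothetical total force, arbitrary units


def force(t, i):
    """i'th applied force (i = 1, 2, ...): a geometric split of a constant push."""
    return PUSH / 2.0**i


def position(t):
    """Path with acceleration PUSH / MASS, starting at rest at the origin."""
    return 0.5 * (PUSH / MASS) * t**2


def resultant_force(t, terms=60):
    """Partial sum of the force series; the neglected tail is negligible."""
    return sum(force(t, i) for i in range(1, terms + 1))


def acceleration(t, h=1e-3):
    """Central second difference of the position."""
    return (position(t + h) - 2.0 * position(t) + position(t - h)) / h**2


t = 1.5
lhs = MASS * acceleration(t)   # mass times acceleration
rhs = resultant_force(t)       # sum of the applied forces
```

Because the terms of the force series shrink geometrically, the resultant does not depend on the order in which the forces are listed, which is the point of requiring absolute convergence in P5.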

The axioms for RBM are stated in Definition 2, below. It will be seen that only two of them contain ordinary physical laws, and the remainder, like axioms P1-P5 in Definition 1, stipulate the set-theoretical type of the primitives. Axiom R1, stating that the first five elements of a system of RBM are themselves a system of PM, contains Newton's Second Law, since the axioms for PM include this law; and axiom R5 is a version of well-known tensor equations relating moment of inertia, angular acceleration (these two being combined to give the rate of change of angular momentum), and resultant torque, or moment force.

DEFINITION 2. An ordered septuple ⟨K, T, g, R, H, $\mu$, $\theta$⟩ is a SYSTEM OF r-DIMENSIONAL RIGID BODY MECHANICS if and only if it satisfies axioms R1-R5.

R1. H is a function with domain K × T × N taking as values ordered pairs of r-vectors, and if $H^1$ and $H^2$ are r-vector valued functions with domain K × T × N such that for all k in K, t in T and i in N,

$$H(k, t, i) = \langle H^1(k, t, i), H^2(k, t, i)\rangle,$$

then ⟨K, T, g, R, $H^1$⟩ is a system of classical r-dimensional particle mechanics.
R2. $\theta$ is a function with domain K × T taking as values r by r orthogonal matrices, such that for all k in K and t in T, $\frac{d^2}{dt^2}(\theta(k, t))$ exists.
R3. $\mu$ is a function with domain K taking as values r by r symmetric positive semi-definite matrices of rank r or r − 1.
R4. For all k in K and t in T, the series $\sum_{i=1}^{\infty} H^2(k, t, i) \times H^1(k, t, i)$ is absolutely convergent. ²
R5. For all k in K and t in T,

$$\theta(k, t) \times \left[\mu(k)\,\frac{d^2}{dt^2}(\theta(k, t))\right] = \sum_{i=1}^{\infty} H^2(k, t, i) \times H^1(k, t, i).$$

The axioms of Definition 2 are all of fairly simple significance. R1 states essentially that the system which is formed by taking only the masses, positions of the centers of mass of the rigid bodies, and the magnitudes and directions of the applied forces, constitutes a system of particle mechanics; i.e. it obeys the laws of particle mechanics. This axiom can be regarded as a version of the theorem that the center of mass of a system of particles or a rigid body moves as though all the mass of the system were located there, and all of the forces applied there. R2 specifies that $\theta(k, t)$ is an r by r orthogonal matrix, as is required by the intended interpretation, since the rows of $\theta(k, t)$ form a set of orthogonal unit vectors in the directions of the moving axes of k. It is necessary that this function be twice differentiable with respect to time in order that the rotational motion of the body be describable as due to finite applied torques. This axiom also rules out impacts, in which there may be discontinuous changes of angular momentum, and for which the angular acceleration does not exist.

The symmetry and positive semi-definiteness of $\mu(k)$ required by axiom R3 also follow directly from the intended interpretation of this concept (see the definition of $\mu$ above). The restriction on the rank of the matrix $\mu(k)$ amounts to a restriction on the 'dimension' of the rigid body k. It can be shown that if all masses are positive, and $\mu(k)$ is defined as above, then the rank of $\mu(k)$ is equal to the dimension of k, defined as the dimension of the smallest 'hyperplane' of $E_r$ containing all of the points of k.

2 The matrix cross-product A × B of two vectors A and B is defined as the difference A*B − B*A. This is a skew-symmetric matrix, corresponding to an antisymmetric double tensor. In three dimensions the matrix A × B depends only on three independent components and is closely related to the three-dimensional vector cross product representing the moment of a force B applied through a lever arm A.

Axiom R4 requires that the sum $\sum_{i=1}^{\infty} H^2(k, t, i) \times H^1(k, t, i)$, which represents the resultant moment force applied to k at time t relative to the fixed coordinate axes of k, be absolutely convergent. This requirement is put on H simply in order that the resultant moment or torque should not depend on the ordering of the applied forces.
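The matrix cross product used for the moment force can be compared with the ordinary vector cross product in three dimensions. The sketch below takes the product of two row vectors A and B to be the transpose of A times B minus the transpose of B times A; the function name and the sample vectors are hypothetical.

```python
import numpy as np


def matrix_cross(a, b):
    """Matrix cross product A x B = A* B - B* A for row vectors (or matrices)."""
    a = np.atleast_2d(np.asarray(a, dtype=float))
    b = np.atleast_2d(np.asarray(b, dtype=float))
    return a.T @ b - b.T @ a


lever_arm = np.array([1.0, 2.0, 3.0])
applied_force = np.array([-2.0, 0.5, 4.0])

torque_matrix = matrix_cross(lever_arm, applied_force)
torque_vector = np.cross(lever_arm, applied_force)

# The three independent entries of the skew-symmetric matrix, read in the
# order (2,3), (3,1), (1,2), reproduce the ordinary vector cross product.
independent_entries = np.array(
    [torque_matrix[1, 2], torque_matrix[2, 0], torque_matrix[0, 1]]
)
```

The matrix form has the advantage of being defined for every dimension r, whereas the vector cross product is special to r = 3.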

Finally, axiom R5 is a formulation of the well-known law equating rate of change of angular momentum and moment force. The matrix expression $\theta(k, t) \times \left[\mu(k)\,\frac{d}{dt}(\theta(k, t))\right]$ represents the angular momentum of k at time t, and the expression on the left side of the equation in R5 is the first time derivative of this angular momentum, which is according to this equation equal to the resultant moment force.
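That the left side of R5 is the time derivative of the angular momentum expression can be verified numerically for a sample rotation. The sketch below assumes a body turning about the third axis through a hypothetical, non-uniform angle, a hypothetical diagonal moment of inertia matrix, and the matrix cross product of footnote 2 extended to matrices; none of these choices comes from the source.

```python
import numpy as np

MU = np.diag([4.0, 2.0, 0.0])  # hypothetical symmetric moment of inertia matrix


def matrix_cross(a, b):
    """Matrix cross product A x B = A* B - B* A."""
    return a.T @ b - b.T @ a


def angle(t):
    return 0.7 * t + 0.2 * t**2


def theta(t):
    """Orientation matrix: rows are the rotating axes in space coordinates."""
    c, s = np.cos(angle(t)), np.sin(angle(t))
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])


def theta_dot(t):
    """First time derivative of theta (chain rule on the angle)."""
    c, s = np.cos(angle(t)), np.sin(angle(t))
    rate = 0.7 + 0.4 * t
    return rate * np.array([[-s, c, 0.0], [-c, -s, 0.0], [0.0, 0.0, 0.0]])


def theta_ddot(t):
    """Second time derivative of theta."""
    c, s = np.cos(angle(t)), np.sin(angle(t))
    rate, rate_dot = 0.7 + 0.4 * t, 0.4
    first = np.array([[-s, c, 0.0], [-c, -s, 0.0], [0.0, 0.0, 0.0]])
    second = np.array([[-c, -s, 0.0], [s, -c, 0.0], [0.0, 0.0, 0.0]])
    return rate_dot * first + rate**2 * second


def angular_momentum(t):
    return matrix_cross(theta(t), MU @ theta_dot(t))


def left_side_of_r5(t):
    return matrix_cross(theta(t), MU @ theta_ddot(t))


t, h = 0.4, 1e-5
numerical_rate = (angular_momentum(t + h) - angular_momentum(t - h)) / (2.0 * h)
```

The agreement rests on the symmetry of the moment of inertia matrix: the extra term produced by the product rule, the cross product of the first derivative with the moment of inertia matrix times that same derivative, vanishes identically.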

Remark 1. The axioms for classical particle mechanics (Definition 1) do not contain any version of Newton's Third Law, nor does any version of it occur in axioms R2 to R5, and therefore our axioms for RBM do not include this law. This omission may seem strange in view of the fact that this is the law which justifies neglecting the internal forces acting between the parts of a rigid body in computing its motion. Two comments are in order here. First, if Newton's Third Law were not true (as it applies to internal forces within rigid bodies), it would only be necessary to represent all forces, internal as well as external, by the function H, and the equations of linear and angular acceleration (axioms P6 and R5) would still hold true. Second, the fact that the Third Law is true justifies the omission of the internal forces, and representing only the external forces by H. It would become necessary to include Newton's Third Law if a distinction between external and internal forces were made within this system, and then the force and moment force occurring in the equations of motion were defined to be the resultants of the external forces only.

Remark  2.  Although  Newton's  Third  Law  is  not  included,  our  axioms 
satisfy  two  criteria  of  adequacy  for  mechanical  theories.  First,  the  well 
known laws of rigid motion, such as Euler's equations (Whittaker [12],
p. 144), and the tensor forms, as well as the much simpler laws for two-dimensional
rigid motion, are derivable from our axioms. Second, it can be
shown  that  our  equations  are  deterministic  in  the  sense  that  if  the  initial 
positions  and  velocities  of  the  bodies  are  arbitrarily  prescribed,  and  the 
applied  forces  are  given,  then  the  paths  of  the  bodies  are  uniquely 
determined. 

Remark 3. It is to be observed that, although the primitive notions $\mu$
and $\theta$ are both defined in terms of position vectors and mass in their
intended interpretations, the only formal connection between moment of
inertia, angular position, and mass and position stated in the axioms is
through axiom R5, specifying a connection with moment force, and
axiom R1 (including Newton's Second Law), which in turn links resultant
force  with  acceleration  and  mass.  If  the  rotational  and  translational 
concepts  were  completely  independent,  this  would  have  the  odd  conse- 
quence that  it  would  be  possible  to  transform  the  coordinate  axes  of  the 
space  by,  say,  a  Galilean  transformation  and  change  the  unit  of  mass 
measurement,  without  this  being  accompanied  by  a  corresponding 
transformation in the moment of inertia and angular position functions.
In turn, if the transformations of $\mu$ and $\theta$ were independent of those
of  space  and  mass,  then  the  former  could  not  be  regarded  as  tensor 
quantities  in  the  usual  sense,  with  prescribed  transformation  laws.  As  was 
noted  above,  the  translational  and  rotational  concepts  are  not  completely 
independent  in  this  theory,  since  they  are  both  linked  to  force.  The  author 
has  not  so  far  been  able  to  determine,  however,  whether  the  two  equations 
of  motion  place  sufficient  constraint  on  the  two  kinds  of  functions,  so  that 
the  transformations  of  the  mass  and  position  functions  uniquely  determine 
the  transformations  of  the  moment  of  inertia  and  angular  position 
functions. 

3.  Reduction.  A  first  glance  at  the  usual  derivation  of  the  laws  of 
RBM  from  those  of  PM  suggests  that  the  reduction  of  RBM  to  PM 
consists  in  the  following:  first,  the  primitive  notions  of  RBM  are  defined 
in  terms  of  those  of  PM,  as  is  indicated  roughly  in  the  intended  inter- 
pretations of  the  primitives  of  RBM,  and  then  the  laws  of  RBM  are 
shown  to  be  derivable  from  those  of  PM,  supplemented  by  the  indicated 
definitions.  Upon  closer  inspection,  however,  two  difficulties  appear  in 
the  above  simple  theory  of  reduction.  The  first  difficulty  is  of  a  technical 
rather  than  of  a  conceptual  nature,  but  is  worth  noting,  none  the  less. 
This  is  simply  that  there  are,  literally,  no  primitive  concepts  in  the  two 
theories we have considered. The two theories formulated in Definitions
1 and 2 are, of course, no more than definitions, and the letters 'P', 'T',
'm', 'S', 'F', and 'K', 'g', 'R', 'H', '$\mu$', and '$\theta$' are actually only variables
employed in the definitions of the predicates 'system of classical r-dimensional
particle mechanics,' and 'system of r-dimensional RBM.'
Each  theory,  in  other  words,  involves  only  one  new  term.  This  apparent 
difficulty  is  circumvented  by  simply  replacing  the  definitions  of  the 
various  concepts  of  RBM  by  a  single  definition  which  combines  all  of 
them,  and  which  defines  the  predicate  'system  of  RBM'  in  terms  of 
'system  of  PM.'  We  shall  not  pursue  this  problem  here,  however,  but  turn 
our  attention  to  the  second  difficulty,  which  is  more  serious. 

Why,  one  may  ask,  should  one  bother  to  define  the  concept  of  a  system 
of  RBM  in  terms  of  that  of  a  system  of  PM,  when,  in  fact,  both  are 
defined  in  terms  of  the  concepts  of  pure  mathematics,  as  they  are  in 
Definitions  1  and  2?  If  it  is  the  case  that  the  concept  of  a  system  of  RBM 
is  definable  in  terms  of  set-theoretical  concepts  alone,  as  in  Definition  2, 
and  the  laws  of  RBM  follow  from  those  of  set  theory  augmented  by  the 
definition  in  question,  then  it  should  follow,  according  to  the  theory  of 
reduction  just  proposed,  that  RBM  is  reducible  to  set  theory. 

On  intuitive  grounds,  any  definition  of  'reduction'  which  has  a  con- 
sequence that  some  physical  theory  is  reducible  to  set  theory  and  analysis, 
seems  unacceptable. 

The  solution  we  shall  propose  to  the  difficulty  raised  above  (assuming  it 
is  felt  to  be  one)  involves  a  revision  of  the  concept  of  a  theory  which  we 
have  been  tacitly  assuming  up  until  now;  i.e.,  that  a  theory  —  in  particu- 
lar the  theories  of  RBM  and  PM  —  is  simply  the  set-theoretical  predicate 
defined  by  its  axioms.  3  4  This  revision  is  suggested  by  a  closer  examina- 
tion of  the  situation  which  prevails  when  one  theory  is  reduced  to  another. 
The  reduction  of  RBM  to  PM  involves  more  than  an  arbitrary  formal 
definition  of  the  concepts  of  the  former  theory  —  moment  of  inertia  and 
angular  position  —  in  terms  of  those  of  the  latter,  from  which  the  laws 
of RBM can be shown to follow. As Nagel [8] has pointed out, these 'definitions'
are actually empirical hypotheses, and as such, ones which might
be false. There is, however, nothing in the account so far given of theories
and their mutual relations which takes into account the fact that theories
and the hypotheses represented by the 'definitions' involved in the
reduction of one theory to another may be either true or false. Our first
step, then, in analyzing the logic of reduction, will be to elaborate the
concept of a theory in such a way that it will be possible to speak of its
truth or falsity. 5

3 Since the set-theoretical predicate is determined by the axioms, and conversely
it determines the axioms in the sense that the axioms are simply statements which
are true of all and only those entities which satisfy the predicate, it makes little
difference whether theories are construed as set-theoretical predicates or as sets of
axioms. Thus, one would not expect to get around the difficulty by simply going
over to the linguistic version of a theory, which construes it as a set of axioms plus
all of the theorems derivable from the axioms.

4 See [6] for a discussion of this concept of a theory.

There  are  undoubtedly  many  ways  of  bringing  the  concept  of  truth  or 
correctness  into  formal  consideration.  One  way,  for  example,  is  to  require 
that  the  axioms  be  consistent  with  a  set  of  observation  sentences.  In  any 
case there must be some kind of reference beyond the axioms themselves
to  the  'things'  they  are  supposed  to  describe,  or  to  observations  about 
those  objects.  We  have  chosen  to  approach  this  through  the  notion  of  an 
intended  interpretation  or  an  intended  model  of  the  theory.  Very  roughly 
speaking,  an  intended  model  of  a  theory  is  any  system  which,  for  one 
reason  or  another,  it  is  demanded  that  the  axioms  conform  to.  There  will, 
in  general,  be  a  large  number  of  systems  which  satisfy  the  axioms  of  a 
theory,  but  usually  for  theories  in  empirical  science  only  a  few  of  these 
will be intended applications or intended models. For example, in the
case of classical PM, axiomatized in Definition 1, the ordered quintuple
$\langle P, T, m, S, F \rangle$ such that

$$
\begin{aligned}
P &= \{1\}, \\
T &= [0, 1], \\
m(1) &= 1, \\
S(1, t) &= \langle 0, 0, 0 \rangle && \text{for } 0 \le t \le 1, \\
F(1, t, n) &= \langle 0, 0, 0 \rangle && \text{for } 0 \le t \le 1;\ n = 1, 2, 3, \ldots
\end{aligned}
$$

satisfies the axioms of PM, though it is not normally taken as an intended
model simply for the reason that 1 is not a particle. On the other hand,
the system in which P is the set of planets of the solar system together
with the sun, m gives the masses of these objects (in some fixed units), S
gives their locations relative to a system of cartesian coordinate axes fixed
with respect to the fixed stars, and F gives the gravitational forces acting
between the sun and planets, is an intended model of PM (or, at any rate,
was often taken to be one). It is this second kind of intended model which,
it is expected, should satisfy the axioms, and the axioms or the theory is
judged true or false according as the intended models satisfy the axioms
or not.

5 Some readers will object to speaking of the truth or falsity of a theory, and
would prefer to use the terminology of confirmation. To include the concept of the
confirmation of a theory relative to a given set of data would be to proceed in the
same direction we propose to go: i.e., to include some connections between the
fundamental or defined concepts of the theory and either observation or observation
sentences, which alone will determine either the truth or the degree of confirmation
of the theory. However, the theory of confirmation is at present in such an imperfect
state, as it relates to theories of high complexity, that it would be extremely
difficult if not impossible to found a precise analysis of reduction on it. On the other
hand, the work of Tarski [11] and others on the concepts of truth and satisfaction and
others relating to the interpretation of formal systems makes these concepts ideal
tools for use in precise logical analyses. Our use of the concept of truth rather than
confirmation is thus dictated by the requirements of logical precision; it does not
imply that the author believes that in any 'ultimate' sense the concept of truth is
fundamental and that of confirmation only derivative.

If truth and falsity are to be defined, we have seen that two aspects of a
theory  must  be  brought  into  account:  first,  the  formal  aspect  which 
corresponds  to  the  set-theoretical  predicate  defined  by  the  axioms  (since 
we  wish  later  to  avoid  reference  to  linguistic  entities,  such  as  predicates, 
we  shall  instead  consider  the  extension  of  this  predicate,  which  is  the 
set  of  all  systems  satisfying  the  axioms) ;  and  second,  the  applied  aspect, 
corresponding to the set of intended models. Formally, a theory T will
be construed as an ordered-pair of sets $T = \langle C, I \rangle$ such that C is the set
of all entities satisfying the axioms, and I is the set of intended models.
We shall call C the "characteristic set" of T. In the case of classical PM,
for example, C is the set of all ordered quintuples $\langle P, T, m, S, F \rangle$ satisfying
axioms P1-P6. Just what systems are comprised within the set I
of intended models for classical PM cannot be specified with precision,
owing to the vagueness in the physical concepts of 'particle,' 'position,'
'mass,'  and  'force.'  Even  to  attempt  an  analysis  of  the  intended  models  of 
classical  PM  would  fall  outside  the  scope  of  this  paper.  It  will  turn  out, 
though,  that  such  an  analysis  is  not  essential  to  our  account  of  reduction, 
which  rests  on  certain  assumptions  about  the  relations  between  the 
intended  models  of  PM  and  RBM,  and  not  on  any  theory  as  to  what 
those  models  are. 

One  thing  which  it  is  essential  to  note  in  connection  with  the  intended 
models  of  PM  is  that  they  are  all  'physical  systems'  in  an  extended 
sense.  They  must  be  entities  which  could  at  least  conceivably  satisfy  the 
axioms, and therefore they must be ordered quintuples $\langle P, T, m, S, F \rangle$.
Roughly, then, the intended models will be systems $\langle P, T, m, S, F \rangle$ such
that P is a set of particles (physical objects whose size, for the purposes of
the application, can be neglected, and not, for example, numbers), T is a
set of clock readings during an interval of time, and m, S, and F are functions
giving  the  results  of  measurements  of  mass,  position,  and  forces  applied  to 
particles  of  the  system  during  the  time  interval.  Similarly,  the  intended 
models of RBM will be ordered septuples $\langle K, T, g, R, H, \mu, \theta \rangle$ satisfying
the  descriptions  in  the  intended  interpretations. 

With  theories  characterized  as  ordered-pairs,  the  first  member  of  which 
is  its  characteristic  set  —  i.e.,  all  entities  satisfying  its  axioms  —  and  the 
second  member  of  which  is  its  set  of  intended  models,  "truth"  becomes 
definable  in  an  obvious  way.  The  theory  is  true  if  and  only  if  all  of  its 
intended models satisfy its axioms, otherwise it is false. If $T = \langle C, I \rangle$,
then T is true if and only if I is a subset of C.
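
The definition of truth just given can be mirrored in a small sketch. It is only a toy rendering: the characteristic set C is represented by a membership test (since C is in general infinite), the set I by a finite list, and the one-particle "systems" and the function `obeys_second_law` are hypothetical stand-ins, not the axioms P1-P6.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Theory:
    """A theory as an ordered pair <C, I> (finite toy version)."""
    in_characteristic_set: Callable[[Any], bool]  # membership test for C
    intended_models: List[Any]                    # finite stand-in for I

    def is_true(self) -> bool:
        # T is true if and only if I is a subset of C.
        return all(self.in_characteristic_set(i) for i in self.intended_models)


def obeys_second_law(system, tol=1e-9):
    """Toy axiom: positive mass and mass * acceleration = resultant force."""
    mass, accel, force = system
    return mass > 0 and abs(mass * accel - force) <= tol


# Each hypothetical record is (mass, acceleration, resultant force).
good = Theory(obeys_second_law, [(2.0, 3.0, 6.0), (1.0, 0.0, 0.0)])
bad = Theory(obeys_second_law, [(2.0, 3.0, 6.0), (1.0, 1.0, 5.0)])

print(good.is_true())  # every intended model satisfies the toy axiom
print(bad.is_true())   # one intended model violates it
```

Note that changing the list of intended models changes the truth value while the axioms stay fixed, which is the point of separating C from I.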

In  terms  of  the  modified  conception  of  theory  outlined  above,  it  is 
possible  to  give  what  we  hope  is  a  more  adequate  explication  of  'reduction' 
than  the  one  originally  proposed.  The  'definition'  of  the  fundamental 
concepts  of  the  secondary  theory  of  the  reduction  (in  this  case  RBM)  in 
terms  of  those  of  the  primary  theory  (PM  in  this  case)  represents,  we  have 
argued,  an  empirical  hypothesis.  This  hypothesis  is  one  which  postulates 
that  there  is  a  certain  connection  between  the  intended  models  of  the 
secondary  and  primary  theories.  In  the  case  of  RBM  and  PM,  the  as- 
sumption is  that  every  rigid  body  is  composed  of  particles,  and  that  the 
masses, positions, applied forces, moments of inertia, and angular positions
of the rigid bodies are related to the masses, positions, and applied
forces  on  the  particles  composing  them  as  outlined  in  the  previous  section. 
This  assumption  is  clearly  about  the  intended  interpretations  of  RBM 
and  PM  and  not  about  all  entities  satisfying  their  axioms,  since  there  will 
be  members  of  the  characteristic  set  of  RBM  which  are  not  physical 
objects  at  all,  and  hence  not  'composed'  of  anything.  Similarly,  in  the 
reduction  of  thermodynamics  to  statistical  mechanics,  it  is  assumed  that 
all thermal bodies are composed of molecules, and that the absolute
temperature  of  the  body  is  proportional  to  the  mean  kinetic  energy  of  the 
molecules  composing  it.  This  again  is  an  assumption  about  the  objects  to 
which  the  two  theories  are  applied;  i.e.,  about  their  intended  models.  In 
each  reduction,  it  is  assumed  that  every  intended  model  in  the  secondary 
theory  has  a  particular  relation  to  some  intended  model  of  the  primary 
theory. 

It is possible to formalize the above interpretation of the definition of
the concepts of the secondary theory in terms of those in the primary
theory as follows. Let $T_1 = \langle C_1, I_1 \rangle$ be the primary theory, and let
$T_2 = \langle C_2, I_2 \rangle$ be the secondary theory which is reduced to $T_1$. The
'definition' in question can be represented as a hypothesis that every
intended model $i_2 \in I_2$ of the secondary theory has a special relation R
(which we shall call the 'reduction relation') to an intended model $i_1 \in I_1$
of the primary theory $T_1$. Although we shall not attempt to formalize the
informal characterizations of primary and secondary theories and reductions
given above, we shall set down quasi-formally the basic connection
just stated between the intended models of the primary and
secondary theories and the reduction relation as Condition A, below.

CONDITION A. Let $T_1 = \langle C_1, I_1 \rangle$ and $T_2 = \langle C_2, I_2 \rangle$ be two theories
such that $T_2$ is reduced to $T_1$ by relation R. Then for all $i_2$ in $I_2$ there exists
$i_1$ in $I_1$ such that $i_2 \, R \, i_1$.


Simply  defining  the  intended  models  of  the  secondary  theory  in  terms 
of  the  intended  models  of  the  primary  theory  does  not,  of  course,  reduce 
one  theory  to  the  other.  It  must  also  be  shown  that  in  some  sense,  the 
laws  of  the  secondary  theory  'follow'  from  the  laws  of  the  primary  theory 
together  with  the  definition.  One  way  to  formulate  this  requirement, 
which  avoids  reference  to  such  syntactical  concepts  as  derivability,  is  as 
follows: it must be the case that if any element $c_2$ has relation R to some
element $c_1$ which satisfies the laws of the primary theory (i.e., $c_1$ is in $C_1$),
then $c_2$ satisfies the laws of the secondary theory ($c_2$ is in $C_2$). This second
requirement  is  formulated  explicitly  in  Condition  B,  below. 

CONDITION B. Let $T_1 = \langle C_1, I_1 \rangle$ and $T_2 = \langle C_2, I_2 \rangle$ be two theories such
that $T_2$ is reduced to $T_1$ by relation R. Then for all $c_1$ and $c_2$, if $c_1$ is in $C_1$
and $c_2 \, R \, c_1$, then $c_2$ is in $C_2$.


Conditions  A  and  B  do  not,  of  course,  define  the  concept  of  a  reduction 
relation. However, they do have one very important consequence: if a
theory $T_2$ is reduced to a theory $T_1$ by a relation R satisfying Conditions
A and B, then if $T_1$ is correct, then $T_2$ is correct. Thus, any reduction
relation  which  satisfies  Conditions  A  and  B  satisfies  what  seems  to  us  to 
be  the  most  essential  requirement  for  reduction,  namely:  it  must  be  pos- 
sible to  show  that  if  the  primary  theory  in  the  reduction  is  correct  in  that 
all  of  its  intended  models  satisfy  its  axioms,  then  all  of  the  intended 
models  of  the  secondary  theory  satisfy  its  axioms,  and  therefore  the 
secondary  theory  is  also  correct.  This  is  the  core  of  the  reduction  of 
thermodynamics  to  statistical  mechanics.  In  this  case,  what  is  shown  is 
that  if  the  laws  of  statistical  mechanics  are  correct  (and  this  may  be 
doubtful),  and  the  hypothesis  of  the  reduction  is  correct  (which  says  that 
every thermal body is composed of particles, and its temperature is
proportional  to  the  mean  kinetic  energy  of  the  particles  composing  it), 
then  thermodynamics  is  correct. 
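
The consequence drawn from Conditions A and B can be checked mechanically on small finite examples. The following is a sketch under simplifying assumptions: theories are pairs of finite sets, the relation R is a finite set of pairs (c2, c1), and the element names and the helper `exhaustive_check` are hypothetical. It enumerates every choice of C1, I1, C2, I2 and R over two-element universes and confirms that whenever A and B hold and the primary theory is correct, the secondary theory is correct as well.

```python
from itertools import chain, combinations, product


def condition_a(I2, I1, R):
    """Every intended model of T2 is R-related to some intended model of T1."""
    return all(any((i2, i1) in R for i1 in I1) for i2 in I2)


def condition_b(C1, C2, R):
    """Whatever is R-related to a member of C1 is itself a member of C2."""
    return all(c2 in C2 for (c2, c1) in R if c1 in C1)


def is_correct(C, I):
    """A theory <C, I> is correct iff I is a subset of C."""
    return set(I) <= set(C)


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(s) for s in chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1))]


def exhaustive_check(U1, U2):
    """Count the cases where A, B and correctness of T1 hold; fail if T2 is ever incorrect."""
    pairs = list(product(U2, U1))  # candidate pairs (c2, c1)
    cases = 0
    for C1, I1, C2, I2, R in product(subsets(U1), subsets(U1),
                                     subsets(U2), subsets(U2), subsets(pairs)):
        if condition_a(I2, I1, R) and condition_b(C1, C2, R) and is_correct(C1, I1):
            assert is_correct(C2, I2), (C1, I1, C2, I2, R)
            cases += 1
    return cases


cases_checked = exhaustive_check(["p1", "p2"], ["r1", "r2"])
print(cases_checked)
```

The general argument is the same as the one the enumeration confirms: an intended model of the secondary theory is related by A to an intended model of the primary theory, which lies in C1 by correctness, so B places the former in C2.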

Remark.  As  has  been  pointed  out,  Conditions  A  and  B  do  not  define 
the  concept  of  reduction.  A  complete  analysis  of  this  notion  would  un- 
doubtedly formulate  considerably  more  restrictive  conditions  than  ours 
on  the  concept.  In  fact,  our  conditions  are  so  weak  that  for  any  two 
correct  theories  it  is  possible  to  construct  a  trivial  relation  'reducing'  one 
to  the  other  satisfying  Conditions  A  and  B.  Nagel  [8]  and  Bergmann  [2] 
have  discussed  some  further  restrictions  informally.  However,  it  is  worth 
observing  that  conditions  much  like  our  A  and  B  are  central  to  both  of 
their  analyses. 
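
One way to see the triviality mentioned in the Remark is the following construction, offered as an illustration rather than as the author's own example. It assumes, beyond what the text states, that the primary theory has at least one intended model.

```latex
% Assumption (not in the text): I_1 is non-empty.
% Let T_1 = <C_1, I_1> and T_2 = <C_2, I_2>, with T_2 correct (I_2 \subseteq C_2).
\[
  R \;=\; I_2 \times I_1 ,
  \qquad\text{i.e.}\qquad
  c_2 \, R \, c_1 \iff c_2 \in I_2 \ \text{and}\ c_1 \in I_1 .
\]
% Condition A: for any i_2 \in I_2, choose any i_1 \in I_1; then i_2 \, R \, i_1.
% Condition B: if c_1 \in C_1 and c_2 \, R \, c_1, then c_2 \in I_2 \subseteq C_2.
```

The relation makes no use of the content of either theory, which is why further restrictions of the kind discussed by Nagel and Bergmann are needed.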

4.  Reduction  of  RBM  to  PM.  The  reduction  relation  relating  RBM  and 
PM  can  be  defined  by  simply  formalizing  the  descriptions  of  the  intended 
interpretations  of  the  primitive  notions  of  RBM  in  terms  of  the  concepts 
of  PM,  as  outlined  in  Section  2.  The  precise  formalization  of  this  defi- 
nition is  too  lengthy  to  be  included  here,  and  we  shall  only  sketch  its 
main features. Let R be the reduction relation; it is necessary to specify
when R holds between an ordered septuple $\Gamma = \langle K, T, g, R, H, \mu, \theta \rangle$
and an ordered quintuple $A = \langle P, T, m, S, F \rangle$. In the intended interpretation
of K, it is assumed that the elements of K are composed of
particles, i.e., that each rigid body is a set of particles. This requirement
can be formulated by imposing the condition that if $\Gamma$ has relation R to A,
then K must be a partition of P; i.e., the particles composing P can be
separated into sets which 'compose' the rigid bodies in K. In addition
to the requirement that K be a partition of P, it is also necessary to
impose the requirement that the particles which form a particular
element of K maintain constant mutual distances: that is, if p and q are
both elements of k, then for all t in T,

$$|S(p,t) - S(q,t)|$$

is a constant.

Not only must the rigid bodies k be composed of elements of P, but the
mass, position, force, moment of inertia, and angular position functions
of $\Gamma$ must have the proper relations to the mass, position, and force
functions of A. For example, the mass g(k) of rigid body k must be the sum
of the masses of the particles composing it. Hence, a condition in the
definition of R must be that if $\Gamma$ has relation R to A, then for all k in K,

$$g(k) = \sum_{p \in k} m(p).$$

Similarly, if R(k, t) is to represent the position of the center of mass of k
at time t, it must be required that for all k in K and t in T,

$$R(k,t) = \frac{1}{g(k)} \sum_{p \in k} m(p)\, S(p,t).$$

Similar conditions relate the remaining functions H, $\mu$ and $\theta$ of $\Gamma$ to the
functions m, S and F of A.
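
The conditions sketched so far (partition, constant mutual distances, mass as a sum, position as center of mass) can be tested on sampled data. The code below is a minimal sketch, not the full definition of the reduction relation: the conditions on H, the moment of inertia and the angular position are left out, times are sampled rather than ranging over an interval, and all names (`is_partition`, `keeps_mutual_distances`, `R_pos`, the two-particle data) are hypothetical. The position function is called `R_pos` to keep it apart from the reduction relation R.

```python
import math


def is_partition(K, P):
    """True if the sets in K are non-empty, pairwise disjoint, and cover P."""
    covered = set()
    for k in K:
        if not k or covered & k:
            return False
        covered |= k
    return covered == set(P)


def keeps_mutual_distances(k, S, times, tol=1e-9):
    """True if |S(p,t) - S(q,t)| is the same at every sampled time for all p, q in k."""
    t0 = times[0]
    for p in k:
        for q in k:
            d0 = math.dist(S(p, t0), S(q, t0))
            if any(abs(math.dist(S(p, t), S(q, t)) - d0) > tol for t in times):
                return False
    return True


def total_mass(k, m):
    """Sum of the masses of the particles composing k."""
    return sum(m[p] for p in k)


def center_of_mass(k, m, S, t):
    """Mass-weighted mean position of the particles of k at time t."""
    g_k = total_mass(k, m)
    return tuple(sum(m[p] * S(p, t)[i] for p in k) / g_k for i in range(3))


# Hypothetical particle system: two particles translating together along the x-axis.
P = {"a", "b"}
m = {"a": 1.0, "b": 3.0}
times = [0.0, 0.25, 0.5, 1.0]


def S(p, t):
    offset = {"a": 0.0, "b": 1.0}[p]
    return (t + offset, 0.0, 0.0)


# Hypothetical rigid body system built on top of it.
body = frozenset(P)
K = [body]
g = {body: 4.0}


def R_pos(k, t):
    return (t + 0.75, 0.0, 0.0)


checks = {
    "partition": is_partition(K, P),
    "rigid": all(keeps_mutual_distances(k, S, times) for k in K),
    "mass": all(math.isclose(g[k], total_mass(k, m)) for k in K),
    "center": all(
        math.isclose(a, b, abs_tol=1e-12)
        for k in K for t in times
        for a, b in zip(R_pos(k, t), center_of_mass(k, m, S, t))
    ),
}
print(checks)
```

Such a check addresses only whether two given systems stand in the relation; whether every intended model of RBM has a partner of this kind is the empirical matter taken up under Condition A.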

With  the  relation  R  defined,  it  is  possible  to  ask  whether  or  not  it 
satisfies  Conditions  A  and  B  given  in  Section  3.  Condition  A  requires  that 
every  intended  model  of  RBM  has  relation  R  to  some  intended  model  of 
PM.  The  intended  models  of  RBM  are  systems  of  rigid  bodies,  and  those 
of  PM  are  systems  of  particles.  That  a  system  of  rigid  bodies  has  relation 
R  to  a  system  of  particles  means  that  the  rigid  bodies  in  the  first  system 
are  composed  of  the  particles  in  the  second  system,  that  the  masses  of  the 
rigid  bodies  are  equal  to  the  sums  of  the  masses  of  the  particles  composing 
them,  and  that  the  other  functions  of  the  rigid  body  system  have  the 
proper  relations  to  the  mass,  position,  and  force  functions  of  the  particle 
system. Clearly, whether or not Condition A is satisfied depends on the
empirical  hypothesis  that  all  rigid  bodies  are  composed  of  particles  which 
move  about  as  though  fixed  in  rigid  frames,  and  the  sum  of  whose  masses 
is  equal  to  the  mass  of  the  body. 

The  determination  of  whether  or  not  Condition  B  is  satisfied  does  not 
raise  any  empirical  questions.  In  fact,  it  can  be  shown  logically  that  if  a 
system $A = \langle P, T, m, S, F \rangle$ satisfies the axioms of PM, and if $\Gamma = \langle K, T, g, R, H, \mu, \theta \rangle$
has relation R to A, then $\Gamma$ satisfies the axioms of
RBM.  This  is  essentially  what  is  proven  in  the  usual  'derivations'  of  the 
laws  of  RBM  from  those  of  PM  given  in  text  books,  and  this  is  equivalent 
to  a  proof  that  Condition  B  is  satisfied. 
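
The translational part of such a textbook derivation can be written out in a few lines. This is a sketch under assumptions not quoted in this section: that axiom P6 has the usual form of Newton's Second Law for each particle, that each body k is a finite set of particles, and that R(k,t) is the center of mass of k, so that g(k) R(k,t) is the mass-weighted sum of the particle positions.

```latex
% Assumed form of P6 (not quoted in this section):
%   m(p) \frac{d^2}{dt^2} S(p,t) = \sum_{n=1}^{\infty} F(p,t,n).
\begin{align*}
  g(k)\,\frac{d^2}{dt^2} R(k,t)
    &= \frac{d^2}{dt^2} \sum_{p \in k} m(p)\, S(p,t)
       && \text{(center of mass, masses constant in time)} \\
    &= \sum_{p \in k} m(p)\, \frac{d^2}{dt^2} S(p,t)
       && \text{($k$ finite)} \\
    &= \sum_{p \in k} \sum_{n=1}^{\infty} F(p,t,n)
       && \text{(assumed form of P6).}
\end{align*}
```

The right-hand side is the resultant of all forces applied to the particles of k, so the body's mass times the acceleration of its center of mass equals the resultant force, as the linear equation of motion for rigid bodies requires.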

Hence  it  can  be  rigorously  proven  that  whether  relation  R  actually 
gives  a  reduction  of  RBM  to  PM  depends  on  whether  the  empirical 
assumptions involved in Condition A are correct. If they are not, then
RBM  has  not  been  reduced  to  PM,  and  the  usual  deduction  of  the  laws 
of  RBM  from  those  of  PM  is  invalid.  If,  for  example,  there  were  a  rigid 
body not composed of particles, then it is clear that nothing could be
deduced  about  its  behavior  from  the  laws  of  particle  mechanics,  since 
those  laws  only  describe  the  behavior  of  particles.  6 

The  empirical  question  here  raised  is  a  very  difficult  one,  and  involves 
in  addition  the  problem  of  clarifying  the  rather  vague  notion  of  a  particle. 
It  may  be  observed  that  the  molecular  theory  lends  support  to  the  hypo- 
thesis that  rigid  bodies  are  composed  of  entities  small  enough  to  ap- 
proximate the  point-particles  required  in  the  derivation  of  the  laws  of 
RBM,  and  the  theory  of  solids  indicates  that  these  molecules  remain 
relatively  fixed  within  rigid  bodies.  However,  the  facts  that  molecules 
only  approximate  point-particles,  and  that  they  are  not  perfectly  rigidly 
fixed within the bodies they compose, show that the deduction of the
laws  of  RBM  from  those  of  PM  depends  on  an  hypothesis  which,  taken 
exactly,  is  false.  The  necessary  revisions  are,  however,  complicated,  and 
are,  in  any  case,  beyond  the  scope  of  this  paper. 


6 It may be objected that the derivation of the laws of rigid motion includes the
motions of rigid bodies with continuous mass distributions. The properties of bodies
with continuous distributions are, however, derived from the laws of continuum
mechanics (see, e.g., Noll, W. [9] in this volume), which is not a branch, but an
extension of particle mechanics.


Bibliography

[1] ADAMS, E. W., Axiomatic Foundations of Rigid Body Mechanics. Unpublished
Ph.D. dissertation, Stanford University, 1955.

[2] BERGMANN, G., Philosophy of Science. Madison 1957, XII + 181 pp.

[3] JOOS, G., Theoretical Physics. Translated by I. Freeman. New York 1934,
XXIII + 748 pp.

[4] McCONNELL, A. J., Applications of the Absolute Differential Calculus. London
1931, XII + 318 pp.

[5] McKINSEY, J. C. C., A. C. SUGAR and P. SUPPES, Axiomatic Foundations of
Classical Particle Mechanics. Journal of Rational Mechanics, vol. 2 (1953),
pp. 253-272.

[6] McKINSEY, J. C. C. and P. SUPPES, Philosophy and the axiomatic foundations
of physics. Proceedings of the XIth International Congress of Philosophy, vol.
VI (1953), pp. 49-53.

[7] MILNE, E. A., Vectorial Mechanics. New York 1948, XII + 382 pp.

[8] NAGEL, E., The meaning of reduction in the natural sciences. Reprinted in
Readings in Philosophy of Science, P. P. Wiener editor, New York (1953), pp.
531-549.

[9] NOLL, W., The foundations of classical mechanics in the light of recent advances
in continuum mechanics. This volume, pp. 266-281.

[10] RUBIN, H. and P. SUPPES, Transformations of systems of relativistic particle
mechanics. Pacific Journal of Mathematics, vol. 4 (1954), pp. 563-601.

[11] TARSKI, A., The Concept of Truth in Formalized Languages. In Logic, Semantics,
Metamathematics by A. Tarski, translated by J. H. Woodger. Oxford
1956, pp. 152-278.

[12] WHITTAKER, E. T., A Treatise on the Analytical Dynamics of Particles and
Rigid Bodies, 4th ed. New York 1944, XIV + 456 pp.


Symposium  on  the  Axiomatic  Method 


THE  FOUNDATIONS  OF  CLASSICAL  MECHANICS 

IN  THE  LIGHT  OF  RECENT  ADVANCES 

IN  CONTINUUM  MECHANICS  1 

WALTER  NOLL 

Carnegie  Institute  of  Technology,  Pittsburgh,   Pennsylvania,   U.S.A. 

1.  Introduction.  It  is  a  widespread  belief  even  today  that  classical 
mechanics  is  a  dead  subject,  that  its  foundations  were  made  clear  long 
ago,  and  that  all  that  remains  to  be  done  is  to  solve  special  problems.  This 
is  not  so.  It  is  true  that  the  mechanics  of  systems  of  a  finite  number  of 
mass  points  has  been  on  a  sufficiently  rigorous  basis  since  Newton.  Many 
textbooks  on  theoretical  mechanics  dismiss  continuous  bodies  with  the 
remark  that  they  can  be  regarded  as  the  limiting  case  of  a  particle  system 
with  an  increasing  number  of  particles.  They  cannot.  The  erroneous  belief 
that  they  can  had  the  unfortunate  effect  that  no  serious  attempt  was 
made  for  a  long  period  to  put  classical  continuum  mechanics  on  a  rigorous 
axiomatic  basis.  Only  the  recent  advances  in  the  theory  of  materials 
other  than  perfect  fluids  and  linearly  elastic  solids  have  revived  the  interest 
in  the  foundations  of  classical  mechanics.  A  clarification  of  these  foun- 
dations is  of  importance  also  for  the  following  reason.  It  is  known  that 
continuous  matter  is  really  made  up  of  elementary  particles.  The  basic 
laws  governing  the  elementary  particles  are  those  of  quantum  mechanics. 
The  science  that  provides  the  link  between  these  basic  laws  and  the  laws 
describing  the  behavior  of  gross  matter  is  statistical  mechanics.  At  the 
present  time  this  link  is  quite  weak,  partly  because  the  mathematical 
difficulties  are  formidable,  and  partly  because  the  basic  laws  themselves 
are  not  yet  completely  clear.  A  rigorous  theory  of  continuum  mechanics 
would give at least some precise information on what kind of gross
behavior  the  basic  laws  ought  to  predict. 

I want to give here a brief outline of an axiomatic scheme for continuum
mechanics, and I shall attempt to introduce the same level of rigor and
clarity as is now customary in pure mathematics. The mathematical
structures involved are quite complex, and some fine details have to be
omitted in order not to overburden the paper with technicalities.

1 The results presented in this paper were obtained in the course of research
sponsored by the U.S. Air Force Office of Scientific Research under contract no.
AF 18 (600)-1138 with Carnegie Institute of Technology.

Notation: Points and vectors in Euclidean space will be indicated
by bold face letters. If x and y are two points, then x − y denotes the
vector determined by the ordered pair (y, x). If x is a point and v a vector,
then x + v denotes the point uniquely determined by (x + v) − x = v.
The word "smooth" will be used instead of "continuously differentiable".
Some equations will be valid only up to a set of measure zero. It will be
clear from the context when this is the case.

2.  Bodies. 

DEFINITION 1: A BODY is a set $\mathfrak{B}$ endowed with a structure defined by

(a) a set $\Phi$ of mappings of $\mathfrak{B}$ into a three-dimensional Euclidean point
space $E$, and

(b) a real valued set function $m$ defined for a set of subsets of $\mathfrak{B}$

subject to seven axioms as follows:

(S.1) Every mapping $\varphi \in \Phi$ is one-to-one.

(S.2) For each $\varphi \in \Phi$, the image $B = \varphi(\mathfrak{B})$ is a region in the space $E$, a
region being defined as a compact set with piecewise smooth boundaries.

(S.3) If $\varphi \in \Phi$ and $\psi \in \Phi$ then the mapping $\chi = \psi \circ \varphi^{-1}$ ² of $\varphi(\mathfrak{B})$ onto
$\psi(\mathfrak{B})$ can be extended to a smooth homeomorphism of $E$ onto itself.

(S.4) If $\chi$ is a smooth homeomorphism of $E$ onto itself and if $\varphi \in \Phi$, then
also $\chi \circ \varphi \in \Phi$.

These four axioms give $\mathfrak{B}$ the structure of a piece of a differentiable
manifold that is isomorphic to a region in Euclidean three-space. The
following three axioms give $\mathfrak{B}$ the structure of a measure space.

(M.1) $m$ is a non-negative measure, defined for all Borel subsets $\mathfrak{C}$ of $\mathfrak{B}$.

(M.2) For each $\varphi \in \Phi$, the measure $\mu_\varphi = m \circ \varphi^{-1}$ induced by $m$ on the region
$B = \varphi(\mathfrak{B})$ in space is absolutely continuous relative to the Lebesgue
measure in $B$. Hence it has a density $\rho_\varphi$ so that

(2.1) $m(\mathfrak{C}) = \int_{\varphi(\mathfrak{C})} \rho_\varphi \, dV$.

(M.3) For each $\varphi \in \Phi$ the density $\rho_\varphi$ is positive and bounded.

2 The symbol $\circ$ denotes the composition of mappings and a superposed −1
denotes the inverse of a mapping.


268  WALTER    NOLL 

We use the following terminology: The elements $X, Y, \ldots$ of $\mathfrak{B}$ are the
PARTICLES of the body. The mappings $\varphi \in \Phi$ are the CONFIGURATIONS of
the body. The point $x = \varphi(X)$ is the POSITION of the particle $X$ in the
configuration $\varphi$. The set function $m$ is the MASS DISTRIBUTION of the body.
The number $m(\mathfrak{C})$ is the MASS of the set $\mathfrak{C}$. Here and subsequently we
refer to Borel sets simply as sets. The density $\rho_\varphi$ is the MASS DENSITY of $\mathfrak{B}$
in the configuration $\varphi$. Note that it would have been sufficient to require
the existence of $\rho_\varphi$ only for one particular configuration $\varphi$. It then follows
that the mass density exists also for all other configurations.

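A compact subset $\mathfrak{P}$ of $\mathfrak{B}$ with piecewise smooth boundaries will be
called a PART of the body $\mathfrak{B}$. It may again be regarded as a body whose
configurations are the restrictions to $\mathfrak{P}$ of the configurations of $\mathfrak{B}$ and
whose mass distribution is the restriction of the mass distribution of $\mathfrak{B}$
to the subsets of $\mathfrak{P}$. Two parts $\mathfrak{P}$ and $\mathfrak{Q}$ will be called SEPARATE if

$\mathfrak{P} \cap \mathfrak{Q} \subset \partial\mathfrak{P} \cap \partial\mathfrak{Q}$,

where $\partial\mathfrak{P}$ denotes the boundary of $\mathfrak{P}$.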

3.  Kinematics 

DEFINITION 2: A MOTION of a body $\mathfrak{B}$ is a one-parameter family $\{\theta_t\}$,
$-\infty < t < \infty$, of configurations $\theta_t \in \Phi$ of $\mathfrak{B}$ such that

(K.1) The derivative

(3.1) $v(X, t) = \frac{d}{dt}\,\theta_t(X)$

exists for all $X \in \mathfrak{B}$ and all $t$, it is a continuous function of $X$ and $t$
jointly, and it is a smooth function of $X$.

(K.2) The derivative

(3.2) $\dot{v}(X, t) = \frac{d}{dt}\, v(X, t) = \frac{d^2}{dt^2}\,\theta_t(X)$

exists piecewise and is piecewise continuous in $X$ and $t$ jointly.

The parameter $t$ is called the TIME. Derivatives with respect to $t$ will be
denoted by superposed dots. $v(X, t)$ is called the VELOCITY of the particle
$X$ at time $t$. $\dot{v}(X, t)$ is called the ACCELERATION of $X$ at $t$.

Let $h$ be any real, vector, or tensor valued function of $X$ and $t$, and
assume that $h(X, t)$ is smooth in $X$ and $t$ jointly. We may then associate
with $h$ the function $\hat{h}$ defined by

(3.3) $\hat{h}(x, t) = h(\theta_t^{-1}(x), t)$

for $-\infty < t < \infty$ and $x \in \theta_t(\mathfrak{B})$. By the chain rule of differentiation we
have

(3.4) $\dot{h}(X, t) = \frac{\partial \hat{h}}{\partial t}(\theta_t(X), t) + \nabla\hat{h}(\theta_t(X), t) \cdot v(X, t)$,

where $\nabla\hat{h}$ denotes the gradient of $\hat{h}$ with respect to $x$. It is customary in
the literature to use the same symbol for $h$ and $\hat{h}$, to omit the independent
variables, and to distinguish $\dot{h}$ from the partial time derivative of $\hat{h}$ by
writing the latter as $\partial h / \partial t$. Equation (3.4) then takes the familiar form

(3.5) $\dot{h} = \frac{\partial h}{\partial t} + v \cdot \operatorname{grad} h$.
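The chain rule (3.4) can be checked numerically. The following is only a minimal sketch under stated assumptions: the one-dimensional motion `theta`, the spatial field `h_hat`, and all function names are hypothetical choices made for illustration and do not appear in the paper. Derivatives are approximated by central differences.

```python
import math

def theta(X, t):
    """Hypothetical 1-D motion: position at time t of the particle X."""
    return X * math.exp(t)

def velocity(X, t):
    """v(X, t) = d/dt theta_t(X) for the motion above."""
    return X * math.exp(t)

def h_hat(x, t):
    """An arbitrarily chosen spatial field h^(x, t)."""
    return x * x * t

def h(X, t):
    """Material description h(X, t) = h^(theta_t(X), t), cf. (3.3)."""
    return h_hat(theta(X, t), t)

def material_derivative_direct(X, t, eps=1e-6):
    """Left side of (3.4): time derivative of h at a fixed particle X."""
    return (h(X, t + eps) - h(X, t - eps)) / (2 * eps)

def material_derivative_chain_rule(X, t, eps=1e-6):
    """Right side of (3.4): partial time derivative plus gradient times velocity."""
    x = theta(X, t)
    dh_dt = (h_hat(x, t + eps) - h_hat(x, t - eps)) / (2 * eps)
    grad_h = (h_hat(x + eps, t) - h_hat(x - eps, t)) / (2 * eps)
    return dh_dt + grad_h * velocity(X, t)

X, t = 0.7, 0.3
lhs = material_derivative_direct(X, t)
rhs = material_derivative_chain_rule(X, t)
# Closed form for this particular choice: h(X, t) = X^2 exp(2t) t.
exact = X * X * math.exp(2 * t) * (1 + 2 * t)
print(lhs, rhs, exact)
```

For this choice of motion and field the two sides agree with each other and with the closed form up to the finite-difference error.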

The LINEAR MOMENTUM at time $t$ of a set $\mathfrak{C} \subset \mathfrak{B}$ is defined by

(3.6) $g(\mathfrak{C}, t) = \int_{\mathfrak{C}} v(X, t)\, dm$.

It follows from (K.1) and (K.2) that $g(\mathfrak{C}, t)$ is piecewise smooth in $t$. As a
function of $\mathfrak{C}$ it is a vector valued measure, absolutely continuous relative
to $m$ with density $v$.

The ANGULAR MOMENTUM at time $t$ of a set $\mathfrak{C} \subset \mathfrak{B}$, relative to a point
$O \in E$, is defined by

(3.7) $h(\mathfrak{C}; t; O) = \int_{\mathfrak{C}} [\theta_t(X) - O] \times v(X, t)\, dm$.

It is piecewise smooth in $t$, and, as a function of $\mathfrak{C}$, it is a vector valued
measure.

4.  Forces 

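DEFINITION 3: A SYSTEM OF BODY FORCES for a body $\mathfrak{B}$ is a family
$\{B_{\mathfrak{P}}\}$ of vector valued set functions subject to the following axioms:

(B.1) For each part $\mathfrak{P}$ of $\mathfrak{B}$, $B_{\mathfrak{P}}$ is a vector valued measure defined on the
Borel subsets of $\mathfrak{P}$.

(B.2) For each $\mathfrak{P}$, $B_{\mathfrak{P}}$ is absolutely continuous relative to the mass
distribution $m$ of $\mathfrak{P}$. Hence it has a density $b_{\mathfrak{P}}$ so that

(4.1) $B_{\mathfrak{P}}(\mathfrak{C}) = \int_{\mathfrak{C}} b_{\mathfrak{P}}\, dm$.

(B.3) The density $b_{\mathfrak{P}}$ is bounded, i.e.

$|b_{\mathfrak{P}}(X)| < k < \infty$,

where $k$ is independent of $\mathfrak{P}$ and $X \in \mathfrak{P}$.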

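DEFINITION 4: A SYSTEM OF CONTACT FORCES for a body $\mathfrak{B}$ is a family
$\{C_{\mathfrak{P}}\}$ of vector valued set functions subject to the following axioms:

(C.1) For each part $\mathfrak{P}$ of $\mathfrak{B}$, $C_{\mathfrak{P}}$ is a vector valued measure defined on the
Borel subsets of $\mathfrak{P}$.

(C.2) $C_{\mathfrak{P}}(\mathfrak{C}) = C_{\mathfrak{P}}(\mathfrak{C} \cap \partial\mathfrak{P})$.

(C.3) If $\mathfrak{c} \subset \partial\mathfrak{Q}$, $\mathfrak{c} \subset \partial\mathfrak{P}$, and $\mathfrak{P} \subset \mathfrak{Q}$, then

$C_{\mathfrak{P}}(\mathfrak{c}) = C_{\mathfrak{Q}}(\mathfrak{c})$.

(C.4) If $\varphi \in \Phi$ is any configuration of $\mathfrak{B}$ and if $P = \varphi(\mathfrak{P})$, then the induced
measure $C_{\mathfrak{P}} \circ \varphi^{-1}$, when restricted to the Borel subsets of the boundary
surface $\partial P$ of $P = \varphi(\mathfrak{P})$, is absolutely continuous relative to the
Lebesgue surface measure on $\partial P$. Hence it has a density $s(\mathfrak{P}, \varphi)$ so that

(4.2) $C_{\mathfrak{P}}(\mathfrak{c}) = \int_{\varphi(\mathfrak{c})} s(\mathfrak{P}, \varphi; x)\, dA$

for all Borel subsets $\mathfrak{c} \subset \partial\mathfrak{P}$.

(C.5) The density $s(\mathfrak{P}, \varphi)$ is bounded, i.e.

$|s(\mathfrak{P}, \varphi; x)| < l < \infty$,

where $l$ does not depend on $\mathfrak{P}$ or $x \in \varphi(\partial\mathfrak{P})$.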


As in the case of a mass distribution, it would have been sufficient in
(C.4) to require the existence of $s(\mathfrak{P}, \varphi)$ only for a particular $\varphi \in \Phi$. The
existence of $s$ for all other configurations is then an automatic
consequence. The axiom (C.2) means that $C_{\mathfrak{P}}$ is essentially a vector measure on the
boundary $\partial\mathfrak{P}$.

It is useful to consider surfaces in $\mathfrak{B}$ as being oriented, and to employ
the operation of addition of oriented surfaces in the sense of algebraic
topology. The boundary $\partial\mathfrak{P}$ of a part $\mathfrak{P}$ of $\mathfrak{B}$ will be regarded as oriented
in such a way that the positive side of $\partial\mathfrak{P}$ is exterior to $\mathfrak{P}$. If $\mathfrak{P}$ and $\mathfrak{Q}$
are two separate parts of $\mathfrak{B}$, then

(4.3) $\partial(\mathfrak{P} \cup \mathfrak{Q}) = \partial\mathfrak{P} + \partial\mathfrak{Q}$.

This is true because the common boundary of $\mathfrak{P}$ and $\mathfrak{Q}$, if any, appears
twice with opposite orientation on the right side of (4.3) and hence
cancels. We shall say that the surface $\mathfrak{c}$ is a PIECE of the surface $\mathfrak{b}$ if $\mathfrak{c}$ is a
subset of $\mathfrak{b}$ and if the orientation of $\mathfrak{c}$ is induced by $\mathfrak{b}$. The significance of
the axiom (C.3) is brought out by the following theorem:

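THEOREM I: There is a vector valued function $S$, defined for all oriented
surfaces $\mathfrak{c}$ in $\mathfrak{B}$, such that

(4.4) $C_{\mathfrak{P}}(\mathfrak{c}) = S(\mathfrak{c})$

whenever $\mathfrak{c}$ is a piece of the boundary $\partial\mathfrak{P}$ of $\mathfrak{P}$. We say that $S(\mathfrak{c})$ is the CONTACT
FORCE ACTING ACROSS THE ORIENTED SURFACE $\mathfrak{c}$.

Proof: For each $\mathfrak{c}$ which is not a piece of $-\partial\mathfrak{B}$ we can find a part
$\mathfrak{Q}(\mathfrak{c})$ of $\mathfrak{B}$ such that $\mathfrak{c}$ is a piece of $\partial\mathfrak{Q}(\mathfrak{c})$. We then define $S(\mathfrak{c}) = C_{\mathfrak{Q}(\mathfrak{c})}(\mathfrak{c})$.
Now let $\mathfrak{P}$ be an arbitrary part of $\mathfrak{B}$ and let $\mathfrak{c}$ be a piece of $\partial\mathfrak{P}$. We then
have

$\mathfrak{c} \subset \partial\mathfrak{P}$, $\mathfrak{c} \subset \partial\mathfrak{Q}(\mathfrak{c})$, $\mathfrak{c} \subset \partial(\mathfrak{Q}(\mathfrak{c}) \cap \mathfrak{P})$,

$\mathfrak{P} \cap \mathfrak{Q}(\mathfrak{c}) \subset \mathfrak{P}$, $\mathfrak{P} \cap \mathfrak{Q}(\mathfrak{c}) \subset \mathfrak{Q}(\mathfrak{c})$.

Applying axiom (C.3) twice, we get

$C_{\mathfrak{P} \cap \mathfrak{Q}(\mathfrak{c})}(\mathfrak{c}) = C_{\mathfrak{P}}(\mathfrak{c})$, $C_{\mathfrak{P} \cap \mathfrak{Q}(\mathfrak{c})}(\mathfrak{c}) = C_{\mathfrak{Q}(\mathfrak{c})}(\mathfrak{c})$.

Hence

$C_{\mathfrak{P}}(\mathfrak{c}) = C_{\mathfrak{Q}(\mathfrak{c})}(\mathfrak{c}) = S(\mathfrak{c})$.

If $\mathfrak{c}$ is a piece of $-\partial\mathfrak{B}$ we define

(4.5) $S(\mathfrak{c}) = -S(-\mathfrak{c})$.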

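It follows from theorem I and axiom (C.4) that there is a vector valued
function $s(\mathfrak{c}, \varphi; x)$ such that

(4.6) $S(\mathfrak{c}) = \int_{\varphi(\mathfrak{c})} s(\mathfrak{c}, \varphi; x)\, dA$.

Also, if $x \in \varphi(\mathfrak{b}) \subset \varphi(\mathfrak{c})$ and if $\mathfrak{b}$ is a piece of $\mathfrak{c}$, then

(4.7) $s(\mathfrak{c}, \varphi; x) = s(\mathfrak{b}, \varphi; x)$.

If $\mathfrak{c}_1$ and $\mathfrak{c}_2$ are two pieces of a surface $\mathfrak{c}$ and if $\mathfrak{c} = \mathfrak{c}_1 + \mathfrak{c}_2$, then

(4.8) $S(\mathfrak{c}) = S(\mathfrak{c}_1) + S(\mathfrak{c}_2)$.

This is true because $C_{\mathfrak{P}}$, as a measure, is additive and because, by axiom
(C.4), the value of $C_{\mathfrak{P}}$ for the common boundary curve of $\mathfrak{c}_1$ and $\mathfrak{c}_2$ is zero.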



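DEFINITION 5: A SYSTEM OF FORCES for a body $\mathfrak{B}$ is a family of vector
valued measures $\{F_{\mathfrak{P}}\}$ such that, for each part $\mathfrak{P}$ of $\mathfrak{B}$, $F_{\mathfrak{P}}$ is defined on the
subsets of $\mathfrak{P}$ and such that the $F_{\mathfrak{P}}$ have decompositions

(4.9) $F_{\mathfrak{P}} = B_{\mathfrak{P}} + C_{\mathfrak{P}}$,

where $\{B_{\mathfrak{P}}\}$ is a system of body forces and $\{C_{\mathfrak{P}}\}$ is a system of contact forces.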


It is not hard to see that the decomposition (4.9), if it exists, is
automatically unique.

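We use the following terminology: The measure $F_{\mathfrak{P}}$ is the FORCE acting
on the part $\mathfrak{P}$ of $\mathfrak{B}$. The vector $F_{\mathfrak{P}}(\mathfrak{P})$ is the RESULTANT FORCE acting on
$\mathfrak{P}$. Let $\mathfrak{P}$ and $\mathfrak{Q}$ be two separate parts of $\mathfrak{B}$. The vector measure

(4.10) $F_{\mathfrak{P},\mathfrak{Q}} = F_{\mathfrak{P}} - F_{\mathfrak{P} \cup \mathfrak{Q}}$,

defined on the subsets of $\mathfrak{P}$, is the MUTUAL FORCE exerted on $\mathfrak{P}$ by $\mathfrak{Q}$.
The mutual force exerted on a part $\mathfrak{P}$ of $\mathfrak{B}$ by the closure of its
complement is denoted by $F_{(\mathfrak{P})}$ and it is called the INTERNAL FORCE acting on $\mathfrak{P}$.
The restriction of $F_{\mathfrak{B}}$ to a part $\mathfrak{P}$ of $\mathfrak{B}$ is the EXTERNAL FORCE acting on $\mathfrak{P}$.
A similar terminology and notation will be used when "force" is replaced
by "body force" or by "contact force".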

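Let $\{F_{\mathfrak{P}}\}$ be a system of forces for a body $\mathfrak{B}$, $\varphi \in \Phi$ a configuration of $\mathfrak{B}$,
and $O \in E$ a point in space. The MOMENT about $O$ of the force $F_{\mathfrak{P}}$ acting
on the part $\mathfrak{P}$ of $\mathfrak{B}$ in the configuration $\varphi$ is the vector valued measure
$M(F_{\mathfrak{P}}, \varphi, O)$ defined by

(4.11) $M(F_{\mathfrak{P}}, \varphi, O; \mathfrak{C}) = \int_{\mathfrak{C}} [\varphi(X) - O] \times dF_{\mathfrak{P}}$

for the subsets $\mathfrak{C}$ of $\mathfrak{P}$. The vector $M(F_{\mathfrak{P}}, \varphi, O; \mathfrak{P})$ is the RESULTANT
MOMENT about $O$ acting on $\mathfrak{P}$.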

5.  Dynamical  processes 

DEFINITION 6: A DYNAMICAL PROCESS is a triple $\{\mathfrak{B}, \theta_t, F_{\mathfrak{P},t}\}$, where $\mathfrak{B}$
is a body, $\theta_t$ is a motion of $\mathfrak{B}$, and $F_{\mathfrak{P},t}$ is a one-parameter family of systems
of forces for $\mathfrak{B}$, subject to the following two axioms:

(D.1) Principle of linear momentum: For all parts $\mathfrak{P}$ of $\mathfrak{B}$ and all times $t$,

(5.1) $F_{\mathfrak{P},t}(\mathfrak{P}) = \dot{g}(\mathfrak{P}, t)$,

where $g$ is defined by (3.6). In words: The resultant force acting on
the part $\mathfrak{P}$ is equal to the rate of change of the linear momentum of $\mathfrak{P}$.

(D.2) Principle of angular momentum: Let $O \in E$ be any point in space.
Then for all parts $\mathfrak{P}$ of $\mathfrak{B}$ and all times $t$,

(5.2) $M(F_{\mathfrak{P},t}, \theta_t, O; \mathfrak{P}) = \dot{h}(\mathfrak{P}; t; O)$,

where $h$ and $M$ are defined by (3.7) and (4.11), respectively. In words:
The resultant moment about $O$ acting on a part $\mathfrak{P}$ is equal to the rate
of change of the angular momentum of $\mathfrak{P}$ relative to $O$.

It would have been sufficient to require that (5.2) holds for a particular
$O \in E$. It is then automatically valid for all points in space. Also, (5.2)
remains valid if the fixed point $O$ is replaced by the variable mass center

(5.3) $c(\mathfrak{P}, t) = O + \frac{1}{m(\mathfrak{P})} \int_{\mathfrak{P}} (\theta_t(X) - O)\, dm$

of the part $\mathfrak{P}$. These statements can be proved in the classical manner.

We now prove a number of important theorems. For simplicity we
omit the variable $t$; we write

(5.4) $s(\mathfrak{c}; x) = s(\mathfrak{c}, \theta_t; x)$

for the density of the contact force as defined by (4.6).

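THEOREM II: For any two separate parts $\mathfrak{P}$ and $\mathfrak{Q}$ of $\mathfrak{B}$ we have

(5.5) $F_{\mathfrak{P},\mathfrak{Q}}(\mathfrak{P}) = -F_{\mathfrak{Q},\mathfrak{P}}(\mathfrak{Q})$,

i.e. the resultant mutual force exerted on $\mathfrak{P}$ by $\mathfrak{Q}$ is equal and opposite to the
resultant mutual force exerted on $\mathfrak{Q}$ by $\mathfrak{P}$.

Proof: We apply axiom (D.1) to $\mathfrak{P}$, $\mathfrak{Q}$, and $\mathfrak{P} \cup \mathfrak{Q}$:

(5.6) $F_{\mathfrak{P}}(\mathfrak{P}) = \dot{g}(\mathfrak{P})$, $F_{\mathfrak{Q}}(\mathfrak{Q}) = \dot{g}(\mathfrak{Q})$, $F_{\mathfrak{P} \cup \mathfrak{Q}}(\mathfrak{P} \cup \mathfrak{Q}) = \dot{g}(\mathfrak{P} \cup \mathfrak{Q})$.

Since $\mathfrak{P} \cap \mathfrak{Q}$ has no mass by (M.2), it follows from (3.6) that

$g(\mathfrak{P} \cup \mathfrak{Q}) = g(\mathfrak{P}) + g(\mathfrak{Q})$;

hence, by (5.6),

$F_{\mathfrak{P} \cup \mathfrak{Q}}(\mathfrak{P} \cup \mathfrak{Q}) = F_{\mathfrak{P}}(\mathfrak{P}) + F_{\mathfrak{Q}}(\mathfrak{Q})$.

It is not hard to see that $F_{\mathfrak{P} \cup \mathfrak{Q}}(\mathfrak{P} \cap \mathfrak{Q}) = 0$. Hence

$F_{\mathfrak{P} \cup \mathfrak{Q}}(\mathfrak{P} \cup \mathfrak{Q}) = F_{\mathfrak{P} \cup \mathfrak{Q}}(\mathfrak{P}) + F_{\mathfrak{P} \cup \mathfrak{Q}}(\mathfrak{Q})$.

The assertion follows now from the definition (4.10).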



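THEOREM III (reaction principle) ³: The contact force $S(\mathfrak{c})$ acting across
$\mathfrak{c}$ is opposite to the contact force acting across $-\mathfrak{c}$, i.e.

(5.7) $S(\mathfrak{c}) = -S(-\mathfrak{c})$.

Proof: If $\mathfrak{c}$ is a piece of $-\partial\mathfrak{B}$, then (5.7) is true by the definition (4.5).
If not, it is possible to find two separate parts $\mathfrak{P}$ and $\mathfrak{Q}$ such that
$\mathfrak{P} \cap \mathfrak{Q} = \mathfrak{c}$ (see Fig. 1). We orient $\mathfrak{c}$ such that it is a piece of $\partial\mathfrak{P}$. Then $-\mathfrak{c}$
will be a piece of $\partial\mathfrak{Q}$.

[Fig. 1]

The surfaces $\partial\mathfrak{P}$, $\partial\mathfrak{Q}$, and $\partial(\mathfrak{P} \cup \mathfrak{Q})$ have decompositions

$\partial\mathfrak{P} = \mathfrak{c} + \mathfrak{b}$, $\partial\mathfrak{Q} = (-\mathfrak{c}) + \mathfrak{e}$, $\partial(\mathfrak{P} \cup \mathfrak{Q}) = \mathfrak{b} + \mathfrak{e}$.

It follows from theorem I and (4.8) that

$C_{\mathfrak{P}}(\mathfrak{P}) = S(\mathfrak{c}) + S(\mathfrak{b})$, $C_{\mathfrak{P} \cup \mathfrak{Q}}(\mathfrak{P}) = S(\mathfrak{b})$,

and hence that

$C_{\mathfrak{P},\mathfrak{Q}}(\mathfrak{P}) = C_{\mathfrak{P}}(\mathfrak{P}) - C_{\mathfrak{P} \cup \mathfrak{Q}}(\mathfrak{P}) = S(\mathfrak{c})$.

Similarly, we obtain

$C_{\mathfrak{Q},\mathfrak{P}}(\mathfrak{Q}) = S(-\mathfrak{c})$.

For the total resultant mutual forces, we get

$F_{\mathfrak{P},\mathfrak{Q}}(\mathfrak{P}) = B_{\mathfrak{P},\mathfrak{Q}}(\mathfrak{P}) + S(\mathfrak{c})$, $F_{\mathfrak{Q},\mathfrak{P}}(\mathfrak{Q}) = B_{\mathfrak{Q},\mathfrak{P}}(\mathfrak{Q}) + S(-\mathfrak{c})$.

Application of theorem II gives

(5.9) $S(\mathfrak{c}) + S(-\mathfrak{c}) = -[B_{\mathfrak{P},\mathfrak{Q}}(\mathfrak{P}) + B_{\mathfrak{Q},\mathfrak{P}}(\mathfrak{Q})]$.

Using axiom (M.3) one can show that the parts $\mathfrak{P}$ and $\mathfrak{Q}$ can be chosen
such that their masses $m(\mathfrak{P})$ and $m(\mathfrak{Q})$ are arbitrarily small. Axiom (B.3)
then implies that the right side of (5.9) can be made arbitrarily small in
absolute value. It follows that the left side of (5.9) must vanish. Q.e.d.

3 Various statements, mostly quite vague, pass under the title "principle of
action and reaction" in the literature. All of these statements, when made precise,
are provable theorems in the theory presented here.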
As a corollary, it follows that

(5.10) $S(\mathfrak{c}_1 + \mathfrak{c}_2) = S(\mathfrak{c}_1) + S(\mathfrak{c}_2)$,

no matter whether $\mathfrak{c}_1$ and $\mathfrak{c}_2$ are pieces of $\mathfrak{c} = \mathfrak{c}_1 + \mathfrak{c}_2$, as in (4.8), or not.
Hence $S$ may be regarded as an additive vector valued function of
oriented surfaces in $\mathfrak{B}$. Another corollary is that the statement of theorem
II remains true if "mutual force" there is replaced by "mutual contact
force" or by "mutual body force".

THEOREM IV (stress principle) ⁴: There is a vector valued function
$s(x, n)$, where $x \in \theta_t(\mathfrak{B})$ and where $n$ is a unit vector, such that

(5.11) $s(\mathfrak{c}; x) = s(x, n)$

whenever $\theta_t(\mathfrak{c})$ has the unit normal $n$ at $x \in \theta_t(\mathfrak{c})$, directed towards the positive
side of the oriented surface $\theta_t(\mathfrak{c})$, the orientation of $\theta_t(\mathfrak{c})$ being induced by the
orientation of $\mathfrak{c}$.

4 The assertion of this theorem appears in all of the past literature as an
assumption. It has been proposed occasionally that one should weaken this assumption and
allow the stress to depend not only on the tangent plane at $x$, but also on the
curvature of the surface $\mathfrak{c}$ at $x$. The theorem given here shows that such dependence on
the curvature, or on any other local property of the surface at $x$, is impossible.

Proof: Let $\mathfrak{c}_1$ and $\mathfrak{c}_2$ be two surfaces in $\mathfrak{B}$ tangent to each other at
$X = \theta_t^{-1}(x)$. The surfaces $c_1 = \theta_t(\mathfrak{c}_1)$ and $c_2 = \theta_t(\mathfrak{c}_2)$ in space $E$ are then
tangent to each other at the point $x$. We assume that $n$ is their unit
normal at $x$ and that $c_1$ and $c_2$ are oriented in such a way that $n$ is directed
toward the positive side of $c_1$ and $c_2$. Consider the region $P_1$ bounded by a
piece $d_1$ of $c_1$, a piece of a circular cylinder $f$ of radius $r$ whose axis is $n$
and by a plane perpendicular to $n$ at a distance $r$ from $x$ as shown in
Fig. 2.

[Fig. 2]

The region $P_2$ is defined in a similar manner. Denote the common
boundary of $P_1$ and $P_2$ on the cylinder and the plane by $e$. The
boundaries $\partial P_1$ and $\partial P_2$ then have decompositions into separate pieces of the form

(5.12) $\partial P_1 = d_1 + e + f_1$, $\partial P_2 = d_2 + e + f_2$,

where $f_1$ and $f_2$ are pieces of the cylinder $f$. We denote the surface area of a
surface $c$ by $A(c)$ and the volume of a region $P$ by $V(P)$. It is not hard to
see that then

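(5.13) $A(d_i) = \pi r^2 + o(r^2)$,
(5.14) $A(f_i) = o(r^2)$,
(5.15) $V(P_i) = O(r^3)$

for $i = 1, 2$. $\mathfrak{P}_1 = \theta_t^{-1}(P_1)$ and $\mathfrak{P}_2 = \theta_t^{-1}(P_2)$ will be parts of $\mathfrak{B}$ for small
enough $r$, except when $x \in \partial\theta_t(\mathfrak{B})$ and $n$ is directed toward the interior
of $\theta_t(\mathfrak{B})$. Applying axiom (D.1) to $\mathfrak{P}_1$ and $\mathfrak{P}_2$ gives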


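(5.16) $F_{\mathfrak{P}_i}(\mathfrak{P}_i) = \int_{\mathfrak{P}_i} \dot{v}\, dm$, $i = 1, 2$.

By (4.1) and (4.4) this may be written in the form

(5.17) $S(\partial\mathfrak{P}_i) = \int_{\mathfrak{P}_i} (\dot{v} - b_{\mathfrak{P}_i})\, dm$.

By (4.6), (4.7), (4.8), and (5.12) we have

(5.18) $S(\partial\mathfrak{P}_i) = \int_{d_i} s(\mathfrak{c}_i)\, dA + \int_{e} s(\mathfrak{e})\, dA + \int_{f_i} s(\mathfrak{f}_i)\, dA$,

where $\mathfrak{f}_i = \theta_t^{-1}(f_i)$, $\mathfrak{e} = \theta_t^{-1}(e)$. Subtracting the two equations (5.18) and
using (5.17), we get

(5.19) $\int_{d_1} s(\mathfrak{c}_1)\, dA - \int_{d_2} s(\mathfrak{c}_2)\, dA = \int_{\mathfrak{P}_1} (\dot{v} - b_{\mathfrak{P}_1})\, dm - \int_{\mathfrak{P}_2} (\dot{v} - b_{\mathfrak{P}_2})\, dm - \int_{f_1} s(\mathfrak{f}_1)\, dA + \int_{f_2} s(\mathfrak{f}_2)\, dA$.

Since $\dot{v}$, $b_{\mathfrak{P}}$, and the mass density are bounded by constants independent
of $\mathfrak{P}$, according to the axioms (K.2), (B.3), and (M.3), it follows from
(5.15) that

$\int_{\mathfrak{P}_i} (\dot{v} - b_{\mathfrak{P}_i})\, dm = o(r^2)$, $i = 1, 2$.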


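Similarly, it follows from axiom (C.5) and from (5.14) that

$\int_{f_i} s(\mathfrak{f}_i)\, dA = o(r^2)$, $i = 1, 2$.

Hence, by (5.19),

$\int_{d_1} s(\mathfrak{c}_1)\, dA = \int_{d_2} s(\mathfrak{c}_2)\, dA + o(r^2)$.

Dividing by $\pi r^2$ and using (5.13), we get

(5.20) $\frac{1}{A(d_1)} \int_{d_1} s(\mathfrak{c}_1)\, dA = \frac{1}{A(d_2)} \int_{d_2} s(\mathfrak{c}_2)\, dA + \frac{o(r^2)}{\pi r^2}$.

By a theorem on measures with density, we have

$\lim_{r \to 0} \frac{1}{A(d_i)} \int_{d_i} s(\mathfrak{c}_i)\, dA = s(\mathfrak{c}_i; x)$, $i = 1, 2$.

Thus, letting $r$ go to zero in (5.20), we finally obtain

$s(\mathfrak{c}_1; x) = s(\mathfrak{c}_2; x)$,

which shows that $s(\mathfrak{c}; x)$ has the same value for all surfaces $\mathfrak{c}$ with the
same unit normal $n$. The exceptional case when $x \in \partial\theta_t(\mathfrak{B})$ and $n$ is
directed toward the interior of $\theta_t(\mathfrak{B})$ is taken care of by the definition
(4.5).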

The vector $s(x, n)$ is called the STRESS acting at $x$ across the surface
element with unit normal $n$. By (4.6) the contact force $S(\mathfrak{c})$ acting across
$\mathfrak{c}$ is given by

(5.21) $S(\mathfrak{c}) = \int_{\theta_t(\mathfrak{c})} s(x, n)\, dA$,

where $n$ is the unit normal at $x$ to the oriented surface $\theta_t(\mathfrak{c})$. By theorem
III we have

(5.22) $s(x, n) = -s(x, -n)$.

The following two additional assumptions suffice to ensure the validity
of the classical theorems of continuum mechanics:

(a) The stress $s(x, n)$, for each $n$, is a smooth function of $x \in \theta_t(\mathfrak{B})$.

(b) For almost all $X \in \mathfrak{B}$, the limit

(5.23)

where $\mathfrak{P}$ is a neighborhood of $X$ contracting to $X$, exists.
Under these assumptions, one can prove the following theorems in
the classical manner:

(1) There is a field of linear transformations $S(x)$, $x \in \theta_t(\mathfrak{B})$, such that

(5.24) $s(x, n) = S(x)n$.

$S(x)$ is called the STRESS TENSOR at $x$.

(2) The stress tensor $S(x)$ is symmetric.

(3) Cauchy's equation of motion

(5.25) $\operatorname{div} S + \rho b = \rho \dot{v}$

holds, where $S$ is the stress tensor, $\rho$ is the mass density, $\dot{v}$ is the
acceleration, and $b$ is defined by (5.23).
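To make (5.24) and the symmetry statement concrete, the following minimal sketch evaluates the stress vector for one sample tensor. The numerical values of the tensor and of the normal are hypothetical and chosen only for illustration; the function names are likewise not from the paper.

```python
def stress_vector(S, n):
    """Return s(x, n) = S(x) n for a 3x3 stress tensor S and a unit normal n."""
    return [sum(S[i][j] * n[j] for j in range(3)) for i in range(3)]

def dot(u, w):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, w))

# Hypothetical symmetric stress tensor at some point x (arbitrary units).
S = [[2.0, 0.5, 0.0],
     [0.5, -1.0, 0.3],
     [0.0, 0.3, 4.0]]

n = [0.0, 0.6, 0.8]   # unit normal, since 0.36 + 0.64 = 1
m = [1.0, 0.0, 0.0]   # a second unit vector

s_plus = stress_vector(S, n)
s_minus = stress_vector(S, [-c for c in n])

# Linearity in n gives s(x, -n) = -s(x, n), in agreement with (5.22).
reaction_gap = max(abs(a + b) for a, b in zip(s_plus, s_minus))

# Symmetry of S is equivalent to m . s(x, n) = n . s(x, m) for all m, n.
symmetry_gap = abs(dot(m, stress_vector(S, n)) - dot(n, stress_vector(S, m)))

print(s_plus, reaction_gap, symmetry_gap)
```

Once the stress depends linearly on the normal, relation (5.22) holds automatically, which is why only the existence of the linear map in (5.24) needs a separate proof.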

6.  Equivalence  of  dynamical  processes.  The  position  of  a  particle  can  be 
specified  physically  not  in  an  absolute  sense  but  only  relative  to  a  given 
frame  of  reference.  Such  a  frame  is  a  set  of  objects  whose  mutual  distances 
change  very  little  in  time,  like  the  walls  of  a  laboratory,  the  fixed  stars, 
or  the  wooden  horses  on  a  merry-go-round.  In  classical  physics,  a  change 
of frame corresponds to a transformation of space and time which
preserves distances and time intervals. It is well known that the most
general such transformation is of the form

(6.1) $x^* = c(t) + Q(t)(x - O)$, $t^* = t + a$,

where $c(t)$ is a point valued function of $t$, $Q(t)$ is a function of $t$ whose
values are orthogonal transformations, $a$ is a real constant, and $O$ is a
point, which may be fixed once and for all. We assume that $c(t)$ and $Q(t)$
are twice continuously differentiable. A change of frame (6.1) also induces
a transformation on vectors and tensors. A vector $u$, for example, is
transformed into

(6.2) $u^* = Q(t)u$.

Let $\{\mathfrak{B}, \theta_t, F_{\mathfrak{P},t}\}$ be a dynamical process. A change of frame $\{c, Q, a\}$
will transform the motion $\theta_t$ into a new motion $\theta_t'$ defined by

(6.3) $\theta_t'(X) = c(t - a) + Q(t - a)[\theta_{t-a}(X) - O]$.

The velocities and the accelerations of the two motions $\theta_t$ and $\theta_t'$ are, in
general, not related by the transformation formula (6.2) for vectors. They
depend on the choice of the frame of reference. We say that they are not
objective. However, there are objective kinematical quantities, for
example the rate of deformation tensor.
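The failure of objectivity for velocities can be seen in a small numerical sketch. Everything specific below is an assumption made for illustration: the setting is two-dimensional, the new frame rotates with a constant angular speed `OMEGA`, the point $c(t)$ coincides with $O$ at the origin, $a = 0$, and the old motion is a uniform translation with velocity `W`.

```python
import math

OMEGA = 0.5  # assumed constant angular speed of the new frame

def Q(t):
    """Rotation by the angle OMEGA * t, an orthogonal transformation in 2-D."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, -s], [s, c]]

def apply(M, u):
    """Apply a 2x2 matrix to a 2-vector."""
    return [M[0][0] * u[0] + M[0][1] * u[1], M[1][0] * u[0] + M[1][1] * u[1]]

W = [1.0, 0.0]  # velocity of every particle in the old frame

def theta(X, t):
    """Old motion: uniform translation with velocity W."""
    return [X[0] + W[0] * t, X[1] + W[1] * t]

def theta_new(X, t):
    """New motion according to (6.3), with c(t) = O = origin and a = 0."""
    return apply(Q(t), theta(X, t))

def velocity_new(X, t, eps=1e-6):
    """Velocity of the new motion by a central difference in t."""
    p, m = theta_new(X, t + eps), theta_new(X, t - eps)
    return [(p[0] - m[0]) / (2 * eps), (p[1] - m[1]) / (2 * eps)]

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

X1, X2, t = [1.0, 2.0], [3.0, -1.0], 0.8

v_new = velocity_new(X1, t)
v_rotated = apply(Q(t), W)              # what the vector law (6.2) would give
velocity_gap = dist(v_new, v_rotated)   # nonzero: the velocity is not objective

distance_old = dist(theta(X1, t), theta(X2, t))
distance_new = dist(theta_new(X1, t), theta_new(X2, t))

print(velocity_gap, distance_old, distance_new)
```

In this sketch the gap equals `OMEGA` times the distance of the particle from the origin, while the distance between the two particles is the same in both frames, as it must be for a change of frame.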

If we wish to assume that forces have an objective meaning we would
have to require that $F_{\mathfrak{P},t}(\mathfrak{C})$ transforms according to the law (6.2) under
a change of frame. However, when this assumption is made, a dynamical
process does not transform into a dynamical process because the axioms
(D.1) and (D.2) are not preserved, except when $c$ is linear in $t$ and $Q$ is
constant. It is this difficulty which has led to the concept of absolute
space and which has caused much controversy in the history of mechanics.
A clarification was finally given by Einstein in his general theory of
relativity, in which gravitational forces and inertial forces cannot be
separated from each other in an objective manner. If we wish to stay in
the realm of classical mechanics we may resolve the paradox by sacrificing
the objectivity of the external body forces while retaining the objectivity
of the essential types of forces, the contact forces and the mutual body
forces. This can be done by assuming that the forces transform according
to a law of the form

(6.4) $F'_{\mathfrak{P},t}(\mathfrak{C}) = Q(t - a)\, F_{\mathfrak{P},t-a}(\mathfrak{C}) + I(\mathfrak{C}, t)$.

Here $I(\mathfrak{C}, t)$ will be called the INERTIAL FORCE acting on $\mathfrak{C}$ due to the
change of frame $\{c, Q, a\}$.

DEFINITION 7: Two dynamical processes $\{\mathfrak{B}, \theta_t, F_{\mathfrak{P},t}\}$ and $\{\mathfrak{B}, \theta_t', F'_{\mathfrak{P},t}\}$
are called EQUIVALENT if there is a change of frame $\{c, Q, a\}$ such that $\theta_t'$ and
$F'_{\mathfrak{P},t}$ are related to $\theta_t$ and $F_{\mathfrak{P},t}$ by (6.3) and (6.4).

The classical analysis of relative motion shows that the inertial force
$I(\mathfrak{C}, t)$ is necessarily of the form

(6.5) $I(\mathfrak{C}, t) = \int_{\mathfrak{C}} i(X, t)\, dm$

with

(6.6) $i(X, t) = \ddot{c}(t - a) + 2V(t - a)[v'(X, t) - \dot{c}(t - a)] + [\dot{V}(t - a) - V^2(t - a)][\theta_t'(X) - c(t - a)]$,

where $v'$ is the velocity of the motion $\theta_t'$, and where $V(t)$ is defined by

(6.7) $V(t) = \dot{Q}(t)Q(t)^{-1}$.

It is not hard to see that the inertial force $I$ gives a contribution only to
the  external  body  forces  and  that  the  contact  forces  and  the  mutual 
forces  transform  according  to  (6.2)  and  hence  are  objective.  The  external 
body  forces  and  the  inertial  forces  cannot  be  separated  from  each  other  in 
an  objective  manner.  Experience  shows  that,  for  the  body  consisting 
of  the  entire  solar  system,  there  are  frames  relative  to  which  the  external 
body  forces  nearly  vanish.  These  are  the  classical  Galilean  frames.  Two 
equivalent  dynamical  processes  really  correspond  to  the  same  physical 
process,  viewed  only  from  two  different  frames  of  reference. 
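The form (6.6) of the inertial force density can be checked numerically for a single particle: the acceleration in the new frame should equal the rotated old acceleration plus $i(X, t)$. The sketch below is two-dimensional and takes $a = 0$ and $O$ at the origin; the rotation angle `alpha`, the point `c_point` and the particle path `x_old` are hypothetical choices. In two dimensions $V = \dot{\alpha} J$ with $J$ the rotation by a right angle, so that $V^2 = -\dot{\alpha}^2$ times the identity.

```python
import math

def alpha(t):      return 0.3 * t * t      # assumed rotation angle of the frame
def alpha_dot(t):  return 0.6 * t
def alpha_ddot(t): return 0.6

def Q(t):
    c, s = math.cos(alpha(t)), math.sin(alpha(t))
    return [[c, -s], [s, c]]

def c_point(t): return [t * t, math.sin(t)]    # assumed c(t)
def c_dot(t):   return [2.0 * t, math.cos(t)]
def c_ddot(t):  return [2.0, -math.sin(t)]

def x_old(t):   return [math.cos(t), t ** 3]   # theta_t(X) for one particle
def acc_old(t): return [-math.cos(t), 6.0 * t]

def matvec(M, u):
    return [M[0][0] * u[0] + M[0][1] * u[1], M[1][0] * u[0] + M[1][1] * u[1]]

def J(u):
    """Rotation by a right angle; V u = alpha_dot * J(u) in 2-D."""
    return [-u[1], u[0]]

def x_new(t):
    """New motion (6.3) with a = 0 and O at the origin."""
    q, c = matvec(Q(t), x_old(t)), c_point(t)
    return [c[0] + q[0], c[1] + q[1]]

def diff1(f, t, eps=1e-5):
    p, m = f(t + eps), f(t - eps)
    return [(p[k] - m[k]) / (2 * eps) for k in range(2)]

def diff2(f, t, eps=1e-4):
    p, z, m = f(t + eps), f(t), f(t - eps)
    return [(p[k] - 2 * z[k] + m[k]) / (eps * eps) for k in range(2)]

def inertial_density(t):
    """i(X, t) from (6.6), specialised to the 2-D rotation above."""
    w, wd = alpha_dot(t), alpha_ddot(t)
    v_rel = [p - q for p, q in zip(diff1(x_new, t), c_dot(t))]   # v' - c_dot
    r = [p - q for p, q in zip(x_new(t), c_point(t))]            # theta' - c
    Jv, Jr, cdd = J(v_rel), J(r), c_ddot(t)
    # 2 V v_rel = 2 w J v_rel;  (V_dot - V^2) r = wd J r + w^2 r
    return [cdd[k] + 2 * w * Jv[k] + wd * Jr[k] + w * w * r[k] for k in range(2)]

t = 0.7
acc_new = diff2(x_new, t)
rotated = matvec(Q(t), acc_old(t))
i_val = inertial_density(t)
residual = max(abs(acc_new[k] - rotated[k] - i_val[k]) for k in range(2))
print(residual)
```

The residual is limited only by the finite-difference error, which supports the reading of the bracket in (6.6) as $\dot{V} - V^2$.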

7.  Constitutive  assumptions.  An  axiom  that  characterizes  the  particular 
material  properties  of  a  body  is  called  a  CONSTITUTIVE  ASSUMPTION.  It 
restricts  the  class  of  dynamical  processes  the  body  can  undergo.  A 
familiar  example  is  the  assumption  that  the  body  is  rigid.  It  restricts  the 
possible  motions  to  those  in  which  the  distance  between  any  two  particles 
remains  unchanged  in  time.  More  important  for  modern  continuum 
mechanics are constitutive assumptions in the form of functional
relations between the stress tensor $S$ and the motion $\theta_t$. Such relations are
called CONSTITUTIVE EQUATIONS (sometimes also rheological equations of
state or stress-strain relations). A classical example is the constitutive
equation for linear viscous fluids

(7.1) $S = (-p + \lambda \operatorname{tr} D)I + 2\mu D$,

where $D$ is the rate of deformation tensor, $I$ is the unit tensor, $p$ is the
pressure, and $\lambda$ and $\mu$ are viscosity constants. A wide variety of
constitutive equations have been investigated in recent years ⁵, and a general
theory of such equations has been developed [2].
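Equation (7.1) is easy to evaluate directly. The sketch below computes the stress for one sample rate of deformation tensor and checks that rotating $D$ rotates the stress in the same way, which is the behavior one expects of a frame-independent relation. All numerical values (`p`, `lam`, `mu`, the tensor `D` and the rotation angle) are hypothetical and serve only as an illustration.

```python
import math

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(A):
    return A[0][0] + A[1][1] + A[2][2]

def viscous_stress(D, p, lam, mu):
    """Stress of a linear viscous fluid, S = (-p + lam tr D) I + 2 mu D, cf. (7.1)."""
    tr = trace(D)
    return [[(-p + lam * tr) * (1.0 if i == j else 0.0) + 2.0 * mu * D[i][j]
             for j in range(3)] for i in range(3)]

# Hypothetical symmetric rate of deformation tensor and material constants.
D = [[1.0, 0.2, 0.0],
     [0.2, -0.5, 0.1],
     [0.0, 0.1, 0.3]]
p, lam, mu = 2.0, 0.7, 1.5

# An orthogonal transformation: rotation about the third axis.
angle = 0.4
Qm = [[math.cos(angle), -math.sin(angle), 0.0],
      [math.sin(angle), math.cos(angle), 0.0],
      [0.0, 0.0, 1.0]]

S = viscous_stress(D, p, lam, mu)
D_star = matmul(matmul(Qm, D), transpose(Qm))
S_from_D_star = viscous_stress(D_star, p, lam, mu)
S_rotated = matmul(matmul(Qm, S), transpose(Qm))

gap = max(abs(S_from_D_star[i][j] - S_rotated[i][j])
          for i in range(3) for j in range(3))
print(trace(S), gap)
```

Here tr S equals 3(−p) + (3λ + 2μ) tr D, which for the sample values is −1.92, and the gap between the two stresses vanishes up to rounding.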

Constitutive  assumptions  are  subject  to  a  general  restriction: 

PRINCIPLE OF OBJECTIVITY: If a dynamical process is compatible with
a  constitutive  assumption  then  all  processes  equivalent  to  it  must  also  be 
compatible  with  this  constitutive  assumption.  In  other  words,  constitutive 
assumptions  must  be  invariant  under  changes  of  frame. 

This principle, although implicitly used by many scientists in the
history of mechanics, was stated explicitly first by Oldroyd [3] and was
clarified further by the author [4]. It is of great importance in the theory
of constitutive equations.

5 A review of the literature and a bibliography is given in [1].

8.  Unsolved  problems.  The  axiomatic  treatment  given  here  is  still  too 
special.  It  does  not  cover  concentrated  forces,  contact  couples  and  body 
couples,  sliding,  impact,  rupture,  and  other  discontinuities,  singularities, 
and  degeneracies.  It  would  be  desirable  to  have  a  universal  scheme  which 
covers  any  conceivable  situation. 

A  more  fundamental  physical  problem  is  to  find  a  rigorous  unified 
theory of continuum mechanics and thermodynamics. Classical
thermodynamics deals only with equilibrium states and hence is not adequate for
processes  with  fast  changes  of  state  in  time.  Such  a  unified  theory  should 
lead  to  further  restrictive  conditions  on  the  form  of  constitutive  equations 
and  hence  to  more  definite  and  realistic  theories  for  special  materials. 
Also,  a  satisfactory  connection  with  statistical  mechanics  can  be  expected 
only  after  such  a  theory  has  been  developed. 


References 

[1] NOLL, W., ERICKSEN, J. L. and TRUESDELL, C., The Non-linear Field Theories
of Mechanics. Article to appear in the Encyclopedia of Physics.

[2] -----, A general theory of constitutive equations. To appear in Archive for
Rational Mechanics and Analysis.

[3]  OLDROYD,  J.  G.,  On  the  formulation  of  rheological  equations  of  state.  Proceedings 
of  the  Royal  Society  of  London  (A)  200  (1950),  523-541. 

[4]  NOLL,  W.,  On  the  Continuity  of  the  Solid  and  Fluid  States.  Journal  of  Rational 
Mechanics  and  Analysis  4  (1955),  3-81. 


Symposium  on  the  Axiomatic  Method 


ON THE AXIOMATIZATION OF MECHANICS

HANS HERMES

University of Münster, Münster in Westphalia, Germany

1. Since Newton, mechanics has been axiomatized several times. Axioms
have been given for the mechanics of mass points and for the mechanics
of continua, for non-relativistic and for relativistic mechanics.

For the present discussion these differences shall play no role. We
want instead to ask what kind of primitive notions occur in the axiom
systems. Most axiomatizations use, among others, kinematical primitive
notions, such as the notions of position, velocity, or acceleration. One
either refers to a fixed frame, or one admits a class of reference frames,
the transition between the individual admitted frames being effected by
Galilean or Lorentz transformations.

The possibility of avoiding kinematical primitive notions, by reducing
them through definitions to notions that are epistemologically prior, will
not be discussed here.

In most axiom systems (e.g. in McKinsey, Sugar and Suppes [3]),
however, one finds not only purely kinematical primitive notions; there
also occur primitive notions such as those of mass, momentum, or force.
These notions must be regarded as typically dynamical, or properly
mechanical, notions. There are, however, also axiom systems (e.g.
Hermes [2]) which manage exclusively with kinematical primitive notions.

When one considers these two possibilities, one will ask what
considerations can be adduced in favor of the one or the other. It must
first be emphasized that an axiom system which uses not only kinematical
primitive notions is much simpler than an axiom system which contains
only kinematical primitive notions. From the standpoint of formal
elegance, therefore, axiom systems that contain, e.g., the notion of mass
as a primitive notion will always be preferable. Another reason in favor
of such axiom systems is mentioned at the end of this section.

Against the use of the notion of mass and similar notions as primitive
notions in an axiom system of mechanics speaks the following
consideration: For a physicist, masses, momenta, or forces are not
immediately given. He must rather determine these quantities by
measurements. Such a measurement, however, ultimately consists in a
reduction to kinematical notions. If, for instance, one determines a mass
by means of a spring balance, one makes a measurement of position; if one
determines it with the help of laws of impact, one measures velocities; or
else one makes use of Newton's third axiom and ascertains accelerations.
One may now wish to take account, in an axiomatization, of the fact that
a physicist determines mechanical quantities in this way by means of
kinematical measurements, by reducing the notion of mass and related
notions through definitions to kinematical notions that correspond to the
physical possibilities of measurement.

In determining mechanical quantities by kinematical measurements one must
presuppose the validity of one or another physical law. In determining
mass, for example, this may be the collision law or the law of
gravitation (cf. [1]). A corresponding definition of mass must make use
of the physical hypothesis in question. Depending on which hypothesis one
builds into the definition of mass, one arrives at different theories of
mechanics which are, in the first instance, incomparable. One should
state this clearly and distinguish explicitly between different
mechanics, just as one has long been accustomed to speak of different
geometries.

Every such mechanics is of course an idealization. It is so in particular
in the following respect: in his measurements a physicist will not
restrict himself to determining, e.g., mass exclusively by means of a
single physical law; he will rather reserve the right to choose the most
suitable law according to the circumstances. An axiomatization of
mechanics which defines mass on the basis of a single physical hypothesis
privileges this law to a special degree. One will be the more ready to
accept such a preference, the more fundamental the law on which it
relies.

The fact that one will not readily be inclined to privilege a particular
physical law in a definition of, e.g., mass may have contributed to the
preference of many authors for using mass and other non-kinematical
concepts as primitive notions in an axiomatization of mechanics.

2. In what follows we report on the attempt, undertaken in [2], to
axiomatize mechanics using purely kinematical primitive notions. For the
definition of mass, recourse is had to the fundamental law of
conservation of momentum in inelastic collisions. At the same time, some
imperfections to which B. Rosser drew attention in his review [4] are to
be removed here (cf. also the corrections in the Appendix). In the
treatise [2] mentioned, relativistic continuum mechanics was developed.
Since in what follows only the basic ideas matter, non-relativistic
particle mechanics will be axiomatized here, which is considerably
simpler in detail.

First some preliminary remarks on symbolization. The axiomatization
presupposes a theory of the real numbers. The properly mechanical
statements are rendered in a logic of types (Stufenlogik), in which two
sorts of individual variables are used at the lowest level. The
individual variables $\tau_1, \tau_2, \ldots$ refer to real numbers, the
individual variables $x, y, \ldots$ to momentary mass points. A momentary
mass point is a mass point considered at a definite instant, i.e. a
temporal cross-section of the world line of a mass point. One can
dispense with explicit use of the concept of a mass point by conceiving a
mass point as a greatest class of momentary mass points that "belong
together". That momentary mass points belong together means that they
belong to one and the same mass point. Following Lewin, momentary mass
points that belong together will be called genidentical.

Positions and times of momentary mass points are fixed by reference
systems. A reference system $\Sigma$, as used in mechanics, can be
conceived as a five-place relation between real numbers $\tau_1, \tau_2,
\tau_3, \tau_4$ and momentary mass points $x$.
$\Sigma\tau_1\tau_2\tau_3\tau_4 x$ says that $x$ has, in the reference
system $\Sigma$, the space coordinates $\tau_1, \tau_2, \tau_3$ and the
time coordinate $\tau_4$. For $\Sigma\tau_1\tau_2\tau_3\tau_4 x$ the
abbreviated notation $\Sigma\mathfrak{r}\tau x$ is frequently used.

The axiom system contains two primitive notions, namely the two-place
relation $G$ and the class $\mathfrak{B}$. $Gxy$ is to mean that the
momentary mass points $x$ and $y$ are genidentical, i.e. that they belong
to the same mass point. $\mathfrak{B}\Sigma$ is to say that $\Sigma$ is a
Galilean reference system (inertial system).

3. After these preparations the axioms will be formulated. For
simplicity, the axioms are stated under the convention that free
variables $\tau_1, \tau_2, \ldots, x, y, \ldots, \Sigma, \ldots$ are
understood to be universally quantified.

The first axiom states that $G$ is an equivalence relation:

AXIOM 1.1.     $Gxx$
AXIOM 1.2.     $Gxy \rightarrow Gyx$
AXIOM 1.3.     $Gxy \wedge Gyz \rightarrow Gxz$

Axiom 2.1 expresses that the coordinates of a momentary mass point $x$
are uniquely determined in every reference system $\Sigma$. Axiom 2.2
says that genidentical momentary mass points $x, y$ are identical if they
have the same time coordinate in a reference system $\Sigma$. Axiom 2.3
requires that a mass point have an "infinite lifetime".

AXIOM 2.1.     $\mathfrak{B}\Sigma \wedge \Sigma\mathfrak{r}_1\tau_1 x \wedge \Sigma\mathfrak{r}_2\tau_2 x \rightarrow \mathfrak{r}_1 = \mathfrak{r}_2 \wedge \tau_1 = \tau_2$

AXIOM 2.2.     $\mathfrak{B}\Sigma \wedge Gx_1x_2 \wedge \Sigma\mathfrak{r}_1\tau x_1 \wedge \Sigma\mathfrak{r}_2\tau x_2 \rightarrow x_1 = x_2$

AXIOM 2.3.     $\mathfrak{B}\Sigma \rightarrow \bigvee_{\mathfrak{r}}\bigvee_{y}(Gxy \wedge \Sigma\mathfrak{r}\tau y)$

The connection between the different coordinate systems is established by
means of the so-called Galilean transformations. Let gal $\Gamma$ mean
that $\Gamma$ is a Galilean transformation. The concept of a Galilean
transformation is well known. Every such transformation is a special
automorphism of the whole real four-dimensional space. If one subjects
the coordinates of all momentary material points in an inertial system
$\Sigma$ to such a Galilean transformation $\Gamma$, one obtains new
coordinates. That this assignment is again an inertial system is required
in Axiom 3.1. That any two inertial systems are connected in this way is
asserted in Axiom 3.2. Here one must note (cf. Rosser's criticism of the
original formulation of the corresponding Axiom A4.5 in [2]) that it is
not assumed that the whole three-dimensional space is occupied by mass
points. Two inertial systems $\Sigma_1$ and $\Sigma_2$ therefore yield
transformations only for those quadruples of real numbers for which there
are momentary mass points having these quadruples as coordinates in
$\Sigma_1$. One may therefore require only that the coordinate
transformation obtained in this way be contained in a Galilean
transformation.
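
The passage to new coordinates under a Galilean transformation can be
illustrated numerically. The following is only a sketch under assumptions
that are not in the text: the rotational part of the transformation is
taken to be the identity, a reference system is modelled as a Python dict
from hypothetical labels of momentary mass points to (position, time)
pairs, and the names `galilei` and `transform_frame` are illustrative.

```python
def galilei(v, a=(0.0, 0.0, 0.0), b=0.0):
    """Return a Galilean transformation (r, t) -> (r + v*t + a, t + b).

    Assumption: no spatial rotation is applied.
    """
    def gamma(r, t):
        r_new = tuple(r[i] + v[i] * t + a[i] for i in range(3))
        return r_new, t + b
    return gamma


def transform_frame(gamma, sigma):
    """Apply gamma to the coordinates each momentary mass point has in sigma."""
    return {point: gamma(r, t) for point, (r, t) in sigma.items()}


# A mass point at rest in sigma, observed at times 0 and 2 (hypothetical labels).
sigma = {"x0": ((1.0, 0.0, 0.0), 0.0), "x1": ((1.0, 0.0, 0.0), 2.0)}
boosted = transform_frame(galilei(v=(-3.0, 0.0, 0.0)), sigma)

(r0, t0), (r1, t1) = boosted["x0"], boosted["x1"]
velocity = tuple((r1[i] - r0[i]) / (t1 - t0) for i in range(3))
print(velocity)  # the point moves uniformly in the new system
```

Because the dict only holds quadruples that are actually occupied by
momentary mass points, the sketch also reflects the remark that two
inertial systems determine a transformation only on such quadruples.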

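In the two axioms that follow, two "concatenations" occur, which are
first to be explained:

DEFINITION  $(\Gamma/\Sigma)\mathfrak{r}\tau x$ means $\bigvee_{\mathfrak{r}'}\bigvee_{\tau'}(\Gamma\mathfrak{r}\tau\mathfrak{r}'\tau' \wedge \Sigma\mathfrak{r}'\tau' x)$

DEFINITION  (Xiz^vrfr    means

With these we formulate

AXIOM 3.1.     $\mathfrak{B}\Sigma \wedge \operatorname{gal}\Gamma \rightarrow \mathfrak{B}(\Gamma/\Sigma)$

AXIOM 3.2.     $\mathfrak{B}\Sigma_1 \wedge \mathfrak{B}\Sigma_2 \rightarrow \bigvee_{\Gamma}(\operatorname{gal}\Gamma \wedge \ldots)$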

We want to read off masses from velocities in collision experiments. If
we take an inertial system $\Sigma$ as basis, the velocity of a momentary
mass point $x_0$ is given as
$\lim_{\tau \to \tau_0} (\mathfrak{r} - \mathfrak{r}_0)/(\tau - \tau_0)$;
here let $\Sigma\mathfrak{r}_0\tau_0 x_0$, and let
$\Sigma\mathfrak{r}\tau x$ hold for that momentary mass point $x$ which
is genidentical with $x_0$ and which has the time coordinate $\tau$ in
$\Sigma$. The existence and uniqueness of $x$ follow from Axioms 2.3 and
2.2. In what follows we shall consider inelastic collisions. It is
simplest to assume that such collisions take place instantaneously and
that the velocities of the mass points involved change discontinuously.
We shall therefore speak of velocities immediately before the collision,
or briefly pre-velocities, and correspondingly of post-velocities.
$\mathrm{Vel}^{-}\,\Sigma\mathfrak{v}\tau x$ is to say that in the
inertial system $\Sigma$ the momentary mass point $x$ has the
pre-velocity $\mathfrak{v}$ at time $\tau$.
$\mathrm{Vel}^{+}\,\Sigma\mathfrak{v}\tau x$ is introduced analogously.
For brevity we forgo the explicit definition of $\mathrm{Vel}^{-}$ and
$\mathrm{Vel}^{+}$ here and formulate the next axiom only in ordinary
language:

AXIOM 4. The mass points have a piecewise continuous velocity; at the
points of discontinuity at least the limits from the left and from the
right exist (pre-velocity and post-velocity, respectively).



4. In a collision two mass points meet at the same place. This is to be
explicitly admitted here (in contrast to the treatise cited) in order to
be able to present the collision laws as simply as possible. It must be
noted, however, that the principle of the impenetrability of matter is
thereby sacrificed. (In the cited treatise [2] no such assumption is
made.)

An inelastic collision is characterized by the fact that the two mass
points involved have the same velocity immediately after the collision.
By choice of a suitable reference system $\Sigma$ this velocity can be
taken to be $\mathfrak{o}$. Let it be required that the velocities
immediately before the collision be different from $\mathfrak{o}$.
Finally it must be required that only the two mass points under
consideration take part in the collision, i.e. that at the time of the
collision there is no third mass point at the place of collision. Stoss
$\Sigma\tau x_1 x_2$ is to mean that the mass points belonging to the
momentary mass points $x_1, x_2$ undergo, at the time $\tau$ measured in
$\Sigma$, a collision in which they come to rest (as measured in
$\Sigma$).

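DEFINITION:     Stoss $\Sigma\tau x_1 x_2 =_{Df} \mathfrak{B}\Sigma \wedge \bigvee_{\mathfrak{r}}(\Sigma\mathfrak{r}\tau x_1 \wedge \Sigma\mathfrak{r}\tau x_2)$

    $\wedge\ \mathrm{Vel}^{+}\,\Sigma\mathfrak{o}\tau x_1 \wedge \mathrm{Vel}^{+}\,\Sigma\mathfrak{o}\tau x_2$

    $\wedge\ \bigvee_{\mathfrak{v}_1}\bigvee_{\mathfrak{v}_2}(\mathrm{Vel}^{-}\,\Sigma\mathfrak{v}_1\tau x_1 \wedge \mathrm{Vel}^{-}\,\Sigma\mathfrak{v}_2\tau x_2 \wedge \mathfrak{v}_1 \neq \mathfrak{o} \wedge \mathfrak{v}_2 \neq \mathfrak{o})$

    $\wedge\ \bigwedge_{y}\bigwedge_{\mathfrak{r}}(\Sigma\mathfrak{r}\tau y \wedge \Sigma\mathfrak{r}\tau x_1 \rightarrow y = x_1 \vee y = x_2)$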


For the inelastic collision the law of conservation of momentum holds. If
$m_1$ and $m_2$ are the masses of the mass points involved and
$\mathfrak{v}_1$ and $\mathfrak{v}_2$ the velocities before the collision
measured in $\Sigma$, then
$m_1\mathfrak{v}_1 + m_2\mathfrak{v}_2 = \mathfrak{o}$. This yields the
possibility of determining the mass ratio $\alpha$ from the ratio of the
magnitudes of the velocities. Masse $\alpha x x_0$ is intended to say
that the mass of $x$ is $\alpha$ times as large as the mass of the
reference mass point $x_0$.

If $x$ and $x_0$ are genidentical, a collision experiment becomes
illusory. In this case the mass ratio is to be set equal to one by
definition. We thus arrive at the fundamental

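DEFINITION:     Masse $\alpha x x_0 =_{Df} \bigvee_{\Sigma}\bigvee_{\tau}\bigvee_{y}\bigvee_{y_0}\bigvee_{\mathfrak{v}}\bigvee_{\mathfrak{v}_0}(Gxy \wedge Gx_0y_0 \wedge \text{Stoss}\ \Sigma\tau y y_0$

    $\wedge\ \mathrm{Vel}^{-}\,\Sigma\mathfrak{v}\tau y \wedge \mathrm{Vel}^{-}\,\Sigma\mathfrak{v}_0\tau y_0 \wedge \alpha\cdot|\mathfrak{v}| = |\mathfrak{v}_0|) \vee (Gxx_0 \wedge \alpha = 1)$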
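
As a purely numerical illustration of how a mass ratio is read off from
pre-velocities, the sketch below evaluates the two clauses of the
definition. It assumes that the collision conditions (common place, rest
after the collision, no third mass point) have already been checked; the
function names and the sample velocities are hypothetical.

```python
import math


def speed(v):
    """Euclidean magnitude of a velocity vector."""
    return math.sqrt(sum(component * component for component in v))


def mass_ratio(v, v0, genidentical=False):
    """Mass of x relative to x0, read off from the pre-velocities v and v0.

    First clause: alpha * |v| = |v0| for a collision bringing both points
    to rest. Second clause: alpha = 1 when x and x0 are genidentical.
    """
    if genidentical:
        return 1.0
    if speed(v) == 0.0 or speed(v0) == 0.0:
        raise ValueError("pre-velocities must differ from the zero vector")
    return speed(v0) / speed(v)


# Hypothetical data: masses 2 and 1, so that 2*v + 1*v0 is the zero vector.
v, v0 = (1.0, 0.0, 0.0), (-2.0, 0.0, 0.0)
alpha = mass_ratio(v, v0)
print(alpha)
```

Swapping the roles of the two points gives the reciprocal ratio, which is
the numerical counterpart of reading the same collision from the other
side.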



In what follows further axioms will be formulated which concern the
existence and uniqueness of the mass ratio. Since only a discussion of
principle matters here, no value is placed on formulating the axioms as
weakly as possible. For a complete construction of mechanics it is of
course necessary to require, beyond the axioms mentioned, still further
ones, concerning e.g. the validity of the momentum theorem. In addition,
among other things, the concept of force would have to be introduced. For
this we refer to [2].

First we formulate an axiom which expresses that the mass just introduced
is a genuine ratio.

AXIOM 5.     Masse $\alpha y z\ \wedge$ Masse $\beta z x\ \wedge$ Masse $\gamma x y \rightarrow \alpha\beta\gamma = 1$

We now formulate some simple theorems. Theorem 5 states that the mass
ratio is unique, i.e. depends only on the mass points involved.

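THEOREM 1:     Stoss $\Sigma\tau x_1 x_2 \rightarrow$ Stoss $\Sigma\tau x_2 x_1$

THEOREM 2:     Masse $\alpha x x_0 \rightarrow \alpha \neq 0$

THEOREM 3:     Masse $\alpha x x_0 \rightarrow$ Masse $\frac{1}{\alpha} x_0 x$

THEOREM 4:     Masse $\alpha x x_0 \wedge Gxy \wedge Gx_0y_0 \rightarrow$ Masse $\alpha y y_0$

THEOREM 5:     Masse $\alpha x x_0\ \wedge$ Masse $\beta y y_0 \wedge Gxy \wedge Gx_0y_0 \rightarrow \alpha = \beta$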

PROOF: Theorem 1 follows immediately from the definition of collision.
Theorem 2 results from the fact that, by the definition of collision, the
pre-velocities in question are different from $\mathfrak{o}$. Theorem 3
results from Theorem 1 and Theorem 2. Theorem 4 follows from the
definition of mass. Theorem 5 is shown as follows: first one has Masse
$\alpha y y_0$ by Theorem 4 and hence Masse $\frac{1}{\alpha} y_0 y$ by
Theorem 3. Because of $Gyy$, Masse $1 y y$ holds. Now one has Masse
$\beta y y_0\ \wedge$ Masse $\frac{1}{\alpha} y_0 y\ \wedge$ Masse
$1 y y$, hence $\beta \cdot \frac{1}{\alpha} \cdot 1 = 1$ by Axiom 5,
that is, $\alpha = \beta$.

The last axiom to be discussed here concerns the existence of the mass
ratio:

AXIOM 6.     $\bigvee_{\alpha}$ Masse $\alpha x x_0$

This axiom expresses that any two different mass points collide
inelastically at least once. That is a very strong requirement. (One
could weaken this requirement by demanding only that for any two
different mass points there be a finite chain of mass points of which any
two successive ones collide inelastically at some time, and by modifying
the definition of the mass ratio accordingly. But even an axiom modified
in this way would express a strong requirement.) On this axiom (or rather
on the analogous Axiom 8.1 in [2]) Rosser says in [4] that it requires
the mass points to "behave in certain very peculiar fashions". Axiom 6
may, however, appear less strange if one bears in mind that it arose as a
formulation of the idealized conception that physicists can determine
masses by collision experiments.

In favor of the formulation given, an analogous state of affairs from
geometry may also be adduced. A geometrical axiom states that for any two
distinct points there is a straight line joining them. The geometrical
axioms, like the axioms of mechanics, originally render physical states
of affairs. If one thinks of a realization of straight lines by, say,
stretched ropes, then the axiom mentioned is to express that between any
two points of space there is such a rope connection. This is completely
analogous to the requirement that any two mass points undergo an
inelastic collision in the course of time.

The geometrical example just considered also gives a hint, however, as to
how one can arrive at a plausible weakening of the axioms considered. For
one could say that the possibility exists of connecting two arbitrary
points of space by a stretched rope. Strictly speaking the axiom would
then be only a statement of possibility. Correspondingly one could weaken
the mechanical Axiom 6 by requiring only that it be possible for any two
mass points to collide inelastically at some time.

Another method of escaping a formulation as strong as that of Axiom 6
consists in reducing the concept of mass to kinematical concepts by means
of a reduction (in Carnap's sense).



APPENDIX: Corrigenda to [2]:

p. 10, line 12 from below, read: ,,((0000)*)". p. 10, line 5 from below, read: „/(*)
p. 13, insert: ,,A4.3': ZZeBzs". p. 13, A4.5 read: „/(/ elo A
p. 28, line 8 from above, for ,,vergleichbaren" read ,,genidentischen". p. 31,
line 6 from above, for ,,/s0/" read ,,/s".


Bibliography

[1]  ADAMS, E. W., The foundations of rigid body mechanics and the
     derivation of its laws from those of particle mechanics. This
     volume, pp. 250-265.
[2]  HERMES, H., Eine Axiomatisierung der allgemeinen Mechanik.
     Forschungen zur Logik und zur Grundlegung der exakten
     Wissenschaften, Neue Folge, Heft 3, Leipzig 1938, 48 pp.
[3]  McKINSEY, J. C. C., SUGAR, A. C. and SUPPES, P., Axiomatic
     Foundations of Classical Particle Mechanics. Journal of Rational
     Mechanics and Analysis, vol. 2 (1953), pp. 253-272.
[4]  ROSSER, B., Review of HERMES [2]. Journal of Symbolic Logic, vol. 3
     (1938), pp. 119-120.


Symposium  on  the  Axiomatic  Method 


AXIOMS  FOR  RELATIVISTIC  KINEMATICS 
WITH  OR  WITHOUT  PARITY 

PATRICK  SUPPES 

Stanford  University,  Stanford,  California,  U.S.A. 

1. Introduction. The primary aim of this paper is to give an elementary
derivation  of  the  Lorentz  transformations,  without  any  assumptions  of 
continuity  or  linearity,  from  a  single  axiom  concerning  invariance  of  the 
relativistic  distance  between  any  two  space-time  points  connected  by  an 
inertial  path.  The  concluding  section  considers  extensions  of  the  theory  of 
relativistic  kinematics  which  will  destroy  conservation  of  temporal  parity, 
that  is,  extensions  which  are  not  invariant  under  time  reversals. 

It  is  philosophically  and  empirically  interesting  that  the  Lorentz 
transformations  can  be  derived  without  any  extraneous  assumptions  of 
continuity  or  differentiability.  In  a  word,  the  single  assumption  needed 
for  relativistic  kinematics  is  that  all  observers  at  rest  in  inertial  frames 
get  identical  measurements  of  relativistic  distances  along  inertial  paths 
when  their  measuring  instruments  have  identical  calibrations.  Note  that 
it is a consequence and not an assumption that these observers are moving
with  a  uniform  velocity  with  respect  to  each  other.  Granted  the  possi- 
bility of  perfect  measurements  everywhere  of  relativistic  intervals,  this 
single  axiom  isolates  in  a  precise  way  the  narrow  operational  basis  needed 
for  the  special  theory  of  relativity. 

Prior  to  any  search  of  the  literature  it  would  seem  that  this  result 
would  be  well-known,  but  I  have  not  succeeded  in  finding  the  proof 
anywhere.  Every  physics  textbook  on  relativity  makes  a  linearity 
assumption  at  the  minimum.  In  geometrical  discussions  of  indefinite 
quadratic  forms  it  is  often  remarked  that  the  relativistic  interval  is 
invariant  under  the  Lorentz  group,  but  it  is  not  proved  that  it  is  invariant 
under  no  wider  group,  which  is  the  main  fact  established  here.  Some 
further  remarks  in  this  connection  are  made  at  the  end  of  Section  2. 

2. Primitive Notions and Single Axiom. Our single initial axiom for
relativistic kinematics is based on three primitive notions, each of which
has a simple physical interpretation. The first notion is an arbitrary set
$X$ interpreted as the set of physical space-time points. The second
notion is a non-empty family $\mathfrak{F}$ of one-one functions mapping
$X$ onto $R_4$, the set of all ordered quadruples of real numbers. (Thus
$X$ must have the power of the continuum.) Intuitively each function in
$\mathfrak{F}$ represents an inertial space-time frame of reference, or,
more explicitly, a space-time measuring apparatus at rest in an inertial
frame. If $x \in X$, $f \in \mathfrak{F}$, and
$f(x) = \langle x_1, x_2, x_3, t\rangle$ then $x_1$, $x_2$, and $x_3$ are
the three orthogonal spatial coordinates of the point $x$, and $t$ the
time coordinate, with respect to the frame $f$. For a more explicit formal
notation, $f_i(x)$ is the $i$th coordinate of the space-time point $x$
with respect to the frame $f$, for $i = 1, \ldots, 4$. The third primitive
notion is a positive number $c$, which is to be interpreted as the speed
of light.

It is convenient to have a notation for the relativistic distance with
respect to a frame $f$ between any two space-time points $x$ and $y$.

DEFINITION 1.     If $x, y \in X$ and $f \in \mathfrak{F}$ then

$$I_f(xy) = \sqrt{\sum_{i=1}^{3} \bigl(f_i(x) - f_i(y)\bigr)^2 - c^2\bigl(f_4(x) - f_4(y)\bigr)^2}.$$

(We always take the square-root with positive sign.) If $f$ is an inertial
frame, then (i) $I_f(xy) = 0$ if $x$ and $y$ are connected by a light
line, (ii) $I_f^2(xy) < 0$ if $x$ and $y$ lie on an inertial path (the
square is negative since $I_f(xy)$ is imaginary); (iii) $I_f^2(xy) > 0$ if
$x$ and $y$ are separated by a "space-like" interval. We use (ii) for a
formal definition.

DEFINITION 2.  If $x, y \in X$ and $f \in \mathfrak{F}$ then $x$ AND $y$
LIE ON AN INERTIAL PATH WITH RESPECT TO $f$ if and only if
$I_f^2(xy) < 0$.
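
The three cases (i) to (iii) can be made concrete with a short numerical
sketch. It assumes that the relativistic distance is the square root of
the summed squared spatial differences minus $c^2$ times the squared time
difference, works directly on coordinate quadruples instead of on points
of $X$, and uses units with $c = 1$ by default; the function names are
illustrative.

```python
import cmath


def interval_squared(p, q, c=1.0):
    """Square of the relativistic distance between coordinate quadruples p, q."""
    spatial = sum((p[i] - q[i]) ** 2 for i in range(3))
    return spatial - c ** 2 * (p[3] - q[3]) ** 2


def interval(p, q, c=1.0):
    """Relativistic distance; the principal complex root is imaginary
    with positive sign when the square is negative."""
    return cmath.sqrt(interval_squared(p, q, c))


def classify(p, q, c=1.0):
    """Sort a pair of distinct quadruples into cases (i)-(iii)."""
    s = interval_squared(p, q, c)
    if s < 0:
        return "inertial path"
    if s == 0:
        return "light line"
    return "space-like"


origin = (0.0, 0.0, 0.0, 0.0)
later = (0.0, 0.0, 0.0, 2.0)
print(classify(origin, later), interval(origin, later))
print(classify(origin, (1.0, 0.0, 0.0, 1.0)))
print(classify(origin, (2.0, 0.0, 0.0, 1.0)))
```

Using `cmath.sqrt` rather than `math.sqrt` keeps the imaginary values of
case (ii) representable instead of raising an error.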

It will also occasionally be useful to characterize inertial paths in
terms of their speed. We may do this informally as follows. By the slope
of a line $\alpha$ in $R_4$, whose projection on the 4th coordinate (the
time coordinate) is a non-degenerate segment, we mean the
three-dimensional vector $W$ such that for any two distinct points
$\langle Z_1, t_1\rangle$ and $\langle Z_2, t_2\rangle$ of $\alpha$

$$W = \frac{Z_1 - Z_2}{t_1 - t_2}.$$

By the speed of $\alpha$ we mean the non-negative number $|W|$. An
inertial path is a line in $R_4$ whose speed is less than $c$; and a light
line is of course a line whose speed is $c$.
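
A minimal sketch of the speed criterion, assuming the slope is the
difference quotient $W = (Z_1 - Z_2)/(t_1 - t_2)$ of spatial and temporal
coordinates; the function names and sample points are illustrative only.

```python
import math


def slope(p1, p2):
    """Three-dimensional slope of the line through the quadruples p1 and p2."""
    dt = p1[3] - p2[3]
    if dt == 0:
        raise ValueError("projection on the time coordinate is degenerate")
    return tuple((p1[i] - p2[i]) / dt for i in range(3))


def line_speed(p1, p2):
    """Speed |W| of the line through p1 and p2."""
    return math.sqrt(sum(w * w for w in slope(p1, p2)))


def is_inertial_path(p1, p2, c):
    """An inertial path is a line whose speed is less than c."""
    return line_speed(p1, p2) < c


a = (0.0, 0.0, 0.0, 0.0)
b = (3.0, 4.0, 0.0, 10.0)
print(line_speed(a, b), is_inertial_path(a, b, c=1.0))
```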

The single axiom we require is embodied in the following definition.

DEFINITION 3.  A system $\mathfrak{X} = \langle X, \mathfrak{F}, c\rangle$
is a COLLECTION OF RELATIVISTIC FRAMES if and only if for every $x, y$ in
$X$, whenever $x$ and $y$ lie on an inertial path with respect to some
frame in $\mathfrak{F}$, then for all $f, f'$ in $\mathfrak{F}$

(1)  $I_f(xy) = I_{f'}(xy)$.

I originally formulated this invariance axiom so as to require that
equation (1) hold for all space-time points $x$ and $y$, that is, without
restricting them to lie on an inertial path (with respect to some frame
in $\mathfrak{F}$). Walter Noll pointed out to me that with this stronger
axiom no physically motivated arguments of the kind given below are
required to prove that any two frames in $\mathfrak{F}$ are related by a
linear transformation; a relatively simple algebraic argument may be
given to show this.

On the other hand, when the invariance assumption is restricted, as it is
here, to distances between points on inertial paths, the line of argument
formalized in the theorems of the next section seems necessary. This
restriction to pairs of points on inertial paths is physically natural
because their distances $I_f(xy)$ are more susceptible to direct
measurements than are the distances of points separated by a space-like
interval (i.e., $I_f^2(xy) > 0$).

3. Theorems. In proving the main result that any two frames in
$\mathfrak{F}$ are related by a Lorentz transformation, some preliminary
definitions, theorems and lemmas will be useful. We shall use freely the
geometrical language appropriate to Euclidean four-dimensional space with
the ordinary positive definite quadratic form.

THEOREM 1.  If $k \geq 0$ and $f(x) - f(y) = k[f(u) - f(v)]$ then
$I_f(xy) = k I_f(uv)$.

PROOF: If $k = 0$, the theorem is immediate. So we need to consider the
case for which $k > 0$. It follows from the hypothesis of the theorem
that

(1)  $x_i - y_i = k(u_i - v_i)$ for $i = 1, \ldots, 4$,

where, for brevity here and subsequently, when we are considering a fixed
element $f$ of $\mathfrak{F}$, $f_i(x) = x_i$, etc. Using (1) and
Definition 1 we then have:

$$I_f(xy) = \sqrt{\sum_{i=1}^{3} (x_i - y_i)^2 - c^2 (x_4 - y_4)^2}
          = \sqrt{\sum_{i=1}^{3} k^2 (u_i - v_i)^2 - c^2 k^2 (u_4 - v_4)^2}
          = k\sqrt{\sum_{i=1}^{3} (u_i - v_i)^2 - c^2 (u_4 - v_4)^2}
          = k I_f(uv).$$  Q.E.D.

In the next theorem we use the notion of betweenness in a way which is
meant not to exclude identity with one of the end points.

THEOREM 2.  If the points $f(x)$, $f(y)$ and $f(z)$ are collinear and
$f(y)$ is between $f(x)$ and $f(z)$ then

$I_f(xy) + I_f(yz) = I_f(xz)$.

PROOF: Extending our subscript notation, let $f(x) = x$, etc. Since the
three points $x$, $y$ and $z$ are collinear, and $y$ is between $x$ and
$z$, there is a number $k$ such that $0 \leq k \leq 1$ and

(1)  $y = kx + (1 - k)z$,

whence

$y - z = k(x - z)$,

and thus by Theorem 1

(2)  $I_f(yz) = k I_f(xz)$.

By adding and subtracting $x$ from the right-hand side of (1), we get:

$y = kx + (1 - k)z + x - x$,

whence

$x - y = (1 - k)(x - z)$,

and thus by virtue of Theorem 1 again,

(3)  $I_f(xy) = (1 - k) I_f(xz)$.

Adding (2) and (3) we obtain the desired result:

$I_f(xy) + I_f(yz) = I_f(xz)$.  Q.E.D.
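
The homogeneity and additivity statements of Theorems 1 and 2 can be
checked numerically on a sample inertial segment. This is only a sketch:
it assumes the distance of Definition 1 evaluated on coordinate
quadruples with $c = 1$, and the points and the value of `k` are
arbitrary illustrative choices.

```python
import cmath


def interval(p, q, c=1.0):
    """Relativistic distance between coordinate quadruples p and q."""
    spatial = sum((p[i] - q[i]) ** 2 for i in range(3))
    return cmath.sqrt(spatial - c ** 2 * (p[3] - q[3]) ** 2)


def between(p, r, k):
    """Point k*p + (1 - k)*r on the segment from p to r, for 0 <= k <= 1."""
    return tuple(k * p[i] + (1 - k) * r[i] for i in range(4))


x = (0.0, 0.0, 0.0, 0.0)
z = (1.0, 0.0, 0.0, 4.0)
k = 0.25
y = between(x, z, k)

# Theorem 1 applied twice, as in the proof of Theorem 2.
scaled_yz = abs(interval(y, z) - k * interval(x, z)) < 1e-12
scaled_xy = abs(interval(x, y) - (1 - k) * interval(x, z)) < 1e-12
# Theorem 2: the distances add along the segment.
additive = abs(interval(x, y) + interval(y, z) - interval(x, z)) < 1e-12
print(scaled_yz, scaled_xy, additive)
```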

Our next objective is to prove a partial converse of Theorem 2. Since the
notion of Lorentz transformation is needed in the proof, we introduce the
appropriate formal definitions at this point. $\mathcal{I}$ is the
identity matrix of the necessary order.

DEFINITION 4.  A matrix $\mathcal{A}$ (of order 4) is a LORENTZ MATRIX if
and only if there exist real numbers $\beta$, $\delta$, a
three-dimensional vector $U$, and an orthogonal matrix $\mathcal{E}$ of
order 3 such that

$$\beta^2\Bigl(1 - \frac{U^2}{c^2}\Bigr) = 1, \qquad \delta^2 = 1,$$

$$\mathcal{A} = \begin{pmatrix} \mathcal{E} & 0 \\ 0 & \delta \end{pmatrix}
\begin{pmatrix} \mathcal{I} + \dfrac{\beta - 1}{U^2}\, U^{*}U & -\dfrac{\beta U^{*}}{c^2} \\ -\beta U & \beta \end{pmatrix}.$$

(In this definition and elsewhere, if $A$ is a matrix, $A^{*}$ is its
transpose, and vectors like $U$ are one-rowed matrices; thus $U^{*}$ is a
one-column matrix.) The physical interpretation of the various quantities
in Definition 4 should be obvious. The number $\beta$ is the Lorentz
contraction factor. When $\delta = -1$, we have a reversal of the
direction of time. The matrix $\mathcal{E}$ represents a rotation of the
spatial coordinates, or a rotation followed by a reflection. The vector
$U$ is the relative velocity of the two frames of reference. For future
reference it may be noted that every Lorentz matrix is non-singular.

DEFINITION 5.  A Lorentz transformation is a one-one function $\varphi$
mapping $R_4$ onto itself such that there is a Lorentz matrix
$\mathcal{A}$ and a 4-dimensional vector $B$ so that for all $Z$ in $R_4$

$\varphi(Z) = Z\mathcal{A} + B$.

The physical interpretation of the vector $B$ is clear. Its first three
coordinates represent a translation of the origin of the spatial
coordinates, and its last coordinate a translation of the time origin.
Definition 5 makes it clear that every Lorentz transformation is a
nonsingular affine transformation of $R_4$, a fact which we shall use in
several contexts. The important consideration for the proof of Theorem 3
is that affine transformations preserve the collinearity of points.
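
To see a Lorentz transformation act on coordinate quadruples, the sketch
below builds a matrix of the block form $\mathrm{diag}(\mathcal{E},
\delta)$ times a boost with relative velocity $U$, and applies it in the
row-vector convention $Z \mapsto Z\mathcal{A} + B$. Several things here
are assumptions made for illustration: the rotation $\mathcal{E}$ is
taken to be the identity, the boost block is taken in the standard form
with entries $\mathcal{I} + \frac{\beta - 1}{U^2}U^{*}U$,
$-\beta U^{*}/c^2$, $-\beta U$, $\beta$, and all names and sample values
are hypothetical.

```python
import math


def lorentz_matrix(U, c=1.0, delta=1):
    """4x4 matrix diag(E, delta) * boost(U), with E the identity (assumption)."""
    u2 = sum(u * u for u in U)
    if u2 >= c * c:
        raise ValueError("relative speed must be below c")
    beta = 1.0 / math.sqrt(1.0 - u2 / c ** 2)
    A = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            identity = 1.0 if i == j else 0.0
            outer = (beta - 1.0) / u2 * U[i] * U[j] if u2 > 0 else 0.0
            A[i][j] = identity + outer
        A[i][3] = -beta * U[i] / c ** 2
        A[3][i] = -delta * beta * U[i]  # left factor scales the last row by delta
    A[3][3] = delta * beta
    return A


def transform(Z, A, B=(0.0, 0.0, 0.0, 0.0)):
    """Row-vector convention: Z -> Z*A + B."""
    return tuple(sum(Z[i] * A[i][j] for i in range(4)) + B[j] for j in range(4))


def interval_squared(p, q, c=1.0):
    spatial = sum((p[i] - q[i]) ** 2 for i in range(3))
    return spatial - c ** 2 * (p[3] - q[3]) ** 2


A = lorentz_matrix(U=(0.6, 0.0, 0.0))
B = (1.0, -2.0, 0.5, 3.0)
p, q = (0.0, 0.0, 0.0, 0.0), (1.0, 2.0, 3.0, 10.0)
before = interval_squared(p, q)
after = interval_squared(transform(p, A, B), transform(q, A, B))
print(before, after)
```

The squared interval agrees before and after up to rounding, both for
$\delta = 1$ and for the time-reversing choice $\delta = -1$.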

THEOREM 3.  If any two of the three points $x$, $y$, $z$ are distinct and
lie on an inertial path with respect to $f$ and if
$I_f(xy) + I_f(yz) = I_f(xz)$, then the points $f(x)$, $f(y)$ and $f(z)$
are collinear, and $f(y)$ is between $f(x)$ and $f(z)$.

PROOF:    Three cases naturally arise.



Case 1.  $I_f^2(xy) < 0$. In this case the line segment joining $f(x)$
and $f(y)$ is an inertial path segment from $x$ to $y$, and there exists
a Lorentz transformation $\varphi$ which will transform this segment to
"rest", that is, more precisely, $\varphi$ may be chosen so as to
transform $f$ to a frame $f'$, which need not be a member of
$\mathfrak{F}$, such that the spatial coordinates of $x$ and $y$ are at
the origin, the time coordinate of $x$ is zero, and $z$ has but one
spatial coordinate, by appropriate spatial rotation. That is, we have:

$f'(x) = \langle 0, 0, 0, 0\rangle$,

$f'(y) = \langle 0, 0, 0, y_4'\rangle$,

$f'(z) = \langle z_1', 0, 0, z_4'\rangle$.

We shall prove that $f'(x)$, $f'(y)$ and $f'(z)$ are collinear. Since
$\varphi$ is non-singular and affine, its inverse $\varphi^{-1}$ exists
and is affine, whence collinearity is preserved in transforming from $f'$
back to $f$.

It is a familiar fact that the relativistic intervals $I_f(xy)$,
$I_f(yz)$ and $I_f(xz)$ are Lorentz invariant and thus have the same
value with respect to $f'$ as $f$. Consequently, from the additive
hypothesis of the theorem, we have:

(1)  $\sqrt{-c^2 {y_4'}^2} + \sqrt{{z_1'}^2 - c^2 (y_4' - z_4')^2} = \sqrt{{z_1'}^2 - c^2 {z_4'}^2}$.

Squaring both sides of (1), then cancelling and rearranging terms, we
obtain:

(2)  $\sqrt{-{y_4'}^2} \cdot \sqrt{{z_1'}^2 - c^2 (y_4' - z_4')^2} = c\, y_4' (y_4' - z_4')$.

If $y_4' = 0$, then $x$ and $y$ are identical, contrary to the hypothesis
that $I_f^2(xy) < 0$. Taking then $y_4' \neq 0$, dividing it out in (2),
squaring both sides and cancelling, we infer:

${z_1'}^2 = 0$,

whence

$z_1' = 0$,

which establishes the collinearity in $f'$ of the three points, since
their spatial coordinates coincide, and obviously $f'(y)$ is between
$f'(x)$ and $f'(z)$.

Case 2.     $I_f^2(yz) < 0$. Proof similar to Case 1.

Case 3.  $I_f^2(xz) < 0$. By an argument similar to that given for Case
1, we may go from $f$ to a frame $f'$ by a Lorentz transformation which
will transform the inertial segment joining $f(x)$ and $f(z)$ to "rest."
That is, we obtain:

$f'(x) = \langle 0, 0, 0, 0\rangle$,

$f'(y) = \langle y_1', 0, 0, y_4'\rangle$,

$f'(z) = \langle 0, 0, 0, z_4'\rangle$.

Then by the additive hypothesis of the theorem:

(3)  $\sqrt{{y_1'}^2 - c^2 {y_4'}^2} + \sqrt{{y_1'}^2 - c^2 (y_4' - z_4')^2} = \sqrt{-c^2 {z_4'}^2}$.

Proceeding as before, by squaring and cancelling, we obtain from (3):

(4)  $\sqrt{{y_1'}^2 - c^2 {y_4'}^2} \cdot \sqrt{{y_1'}^2 - c^2 (y_4' - z_4')^2} = -{y_1'}^2 + c^2 y_4' (y_4' - z_4')$.

Squaring again and cancelling yields:

(5)  ${y_1'}^2 {z_4'}^2 = 0$.

There are now two possibilities to consider: either $y_1' = 0$ or
$z_4' = 0$. If the former is the case, then the three points are
collinear in $R_4$, for they are all three placed at the origin of the
spatial coordinates. On the other hand, if $z_4' = 0$, then $x$ and $z$
are identical points, contrary to hypothesis. Again it is obvious that
$f'(y)$ is between $f'(x)$ and $f'(z)$. Q.E.D.

That a full converse of Theorem 2 cannot be proved, in other words
that the additive hypothesis

$I_f(xy) + I_f(yz) = I_f(xz)$

does not imply collinearity, is shown by the following counterexample:

$f(x) = \langle 0, 0, 0, 0 \rangle$,
$f(y) = \langle 1, 1, 0, 0 \rangle$,
$f(z) = \langle \sqrt{2}\,c, 0, 0, 1 \rangle$.

Clearly, f(x), f(y) and f(z) are not collinear in $R_4$, but
$I_f(xy) + I_f(yz) = I_f(xz)$, that is,

(1)  $\sqrt{2} + \sqrt{(1 - \sqrt{2}\,c)^2 + 1 - c^2} = c$.

For, simplifying and rearranging (1), we see it is equivalent to:

(2)  $\sqrt{2 - 2\sqrt{2}\,c + c^2} = c - \sqrt{2}$,

and the left-hand side of (2) is simply

$\sqrt{(c - \sqrt{2})^2} = c - \sqrt{2}$.

(It may be mentioned that the full converse of Theorem 2 does hold for
$R_2$, that is, when there is a restriction to one spatial dimension.)
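The counterexample can be checked numerically. The following is a minimal sketch, assuming the interval $I_f(xy) = \sqrt{\sum_{i \le 3}(x_i - y_i)^2 - c^2 (x_4 - y_4)^2}$, the value $f(z) = \langle \sqrt{2}\,c, 0, 0, 1 \rangle$ (a choice consistent with the intervals appearing in (1)), and an illustrative value c = 3, which exceeds $\sqrt{2}$ so that all three intervals are real. The helper names are hypothetical.

```python
import math

def interval_sq(p, q, c):
    """Squared relativistic interval between two 4-tuples <x1, x2, x3, t>."""
    spatial = sum((a - b) ** 2 for a, b in zip(p[:3], q[:3]))
    return spatial - c ** 2 * (p[3] - q[3]) ** 2

def collinear(p, q, r, tol=1e-12):
    """True if q - p and r - p are linearly dependent in R_4."""
    u = [b - a for a, b in zip(p, q)]
    v = [b - a for a, b in zip(p, r)]
    # Every 2x2 minor of the 2x4 matrix with rows u, v must vanish.
    return all(abs(u[i] * v[j] - u[j] * v[i]) <= tol
               for i in range(4) for j in range(i + 1, 4))

c = 3.0  # assumed value; any c > sqrt(2) keeps the intervals real
x = (0.0, 0.0, 0.0, 0.0)
y = (1.0, 1.0, 0.0, 0.0)
z = (math.sqrt(2) * c, 0.0, 0.0, 1.0)

I_xy = math.sqrt(interval_sq(x, y, c))
I_yz = math.sqrt(interval_sq(y, z, c))
I_xz = math.sqrt(interval_sq(x, z, c))

additive = math.isclose(I_xy + I_yz, I_xz)
points_collinear = collinear(x, y, z)
```

Here none of the three segments is inertial (each squared interval is positive), which is why the additive relation can hold without collinearity.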

We now want to prove some theorems about properties which are
invariant in 𝔉. Formally, a property is invariant in 𝔉 if and only if it
holds or does not hold uniformly for every member f of 𝔉. Thus to say
that the property of a line being an inertial path is invariant in 𝔉 means
that a line with respect to f in 𝔉 is an inertial path with respect to f if
and only if it is an inertial path with respect to every f' in 𝔉. All geometric
objects referred to here are with respect to the frames in 𝔉.

THEOREM 4. The property of being the midpoint of a finite segment of an
inertial path is invariant in 𝔉.

PROOF:     Suppose x, y and z lie on an inertial path with respect to f and

(1)  $f(y) = \tfrac{1}{2} f(x) + \tfrac{1}{2} f(z)$,

and thus

$f(y) - f(x) = \tfrac{1}{2} [f(z) - f(x)]$.

Consequently by virtue of Theorem 1

(2)  $I_f(xy) = \tfrac{1}{2} I_f(xz)$,

and similarly

(3)  $I_f(yz) = \tfrac{1}{2} I_f(xz)$,

whence

(4)  $I_f(xy) + I_f(yz) = I_f(xz)$.

Now by the invariance axiom of Definition 3, for any f' in 𝔉

$I_{f'}(xy) = I_f(xy)$,

$I_{f'}(yz) = I_f(yz)$,

$I_{f'}(xz) = I_f(xz)$.

Substituting these identities in (4) we obtain:

$I_{f'}(xy) + I_{f'}(yz) = I_{f'}(xz)$.

Thus by virtue of Theorem 3, f'(x), f'(y) and f'(z) are collinear with f'(y)
between f'(x) and f'(z). Moreover, since by the invariance axiom (2) and
(3) hold for f', we conclude f'(y) is actually the midpoint. Q.E.D.

This proof is easily extended to show that the property of being an
inertial path is invariant in 𝔉, but we do not directly need this fact.
We next want to show that this midpoint property is invariant for
arbitrary segments. In view of the counterexample following Theorem 3
it is evident that a direct proof in terms of the relativistic intervals
cannot be given. The method we shall use consists essentially of
constructing a parallelogram whose sides are segments of inertial paths. A
similar but somewhat more complicated proof is given in Rubin and
Suppes [3].

THEOREM 5. The property of being the midpoint of an arbitrary finite
segment is invariant in 𝔉.

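PROOF: Let $A = \langle Z_1, t_1 \rangle$ and $B = \langle Z_2, t_2 \rangle$, where AB is an arbitrary
segment in $R_4$. (The points A to G defined here are with respect to f in
𝔉.) For definiteness assume $t_1 \geq t_2$. We set

$Z_0 = \tfrac{1}{2}(Z_1 + Z_2)$

and we choose $t_0$ and $t_3$ so that

$t_0 < t_2 - \dfrac{|Z_1 - Z_2|}{2c}$,

$t_3 > t_1 + \dfrac{|Z_1 - Z_2|}{2c}$.

We now let (see Figure 1)

$C = \langle Z_0, t_0 \rangle$,          $D = \langle Z_0, t_3 \rangle$,

$E = \dfrac{A + B}{2}$,     $F = \dfrac{B + D}{2}$,     $G = \dfrac{A + C}{2}$.
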
Fig. 1

Denoting now the same points with respect to f' in 𝔉 by primes,
we have by virtue of this construction in f and the invariance
property of Theorem 4,

(1)  $E' = \tfrac{1}{2}(C' + D')$,
(2)  $F' = \tfrac{1}{2}(B' + D')$,
(3)  $G' = \tfrac{1}{2}(A' + C')$,
(4)  $E' = \tfrac{1}{2}(F' + G')$.

Substituting (2) and (3) into (4) we have:

$E' = \tfrac{1}{4}(A' + B' + C' + D')$.

Now substituting (1) into the right-hand side of the last
equation and simplifying, we infer the desired result:

$E' = \tfrac{1}{2}(A' + B')$,

since by construction $E = \tfrac{1}{2}(A + B)$.
Thus the midpoint of an arbitrary segment is preserved.     Q.E.D.
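The parallelogram construction can be illustrated with concrete numbers. This is only a sketch under stated assumptions: c = 1, a particular non-inertial segment AB, and the additional choice $t_0 + t_3 = t_1 + t_2$, which is what makes E the midpoint of CD as well as of AB. The function names are hypothetical.

```python
import math

def midpoint(p, q):
    """Coordinate-wise midpoint of two tuples."""
    return tuple((a + b) / 2 for a, b in zip(p, q))

def is_inertial(p, q, c):
    """True if the segment pq has negative squared interval (timelike)."""
    spatial_sq = sum((a - b) ** 2 for a, b in zip(p[:3], q[:3]))
    return spatial_sq - c ** 2 * (p[3] - q[3]) ** 2 < 0

def close(p, q, tol=1e-12):
    return all(abs(a - b) <= tol for a, b in zip(p, q))

c = 1.0                          # assumed unit for the speed of light
A = (2.0, 0.0, 0.0, 1.0)         # <Z_1, t_1>
B = (0.0, 0.0, 0.0, 0.0)         # <Z_2, t_2>; AB itself is not inertial
Z0 = midpoint(A[:3], B[:3])      # spatial midpoint of Z_1 and Z_2
gap = math.dist(A[:3], B[:3]) / (2 * c)
t0 = B[3] - gap - 1.0            # some t_0 < t_2 - |Z_1 - Z_2| / (2c)
t3 = A[3] + B[3] - t0            # assumption: t_0 + t_3 = t_1 + t_2

C = Z0 + (t0,)
D = Z0 + (t3,)
E = midpoint(A, B)
F = midpoint(B, D)
G = midpoint(A, C)

inertial_sides = all(is_inertial(p, q, c)
                     for p, q in [(A, C), (A, D), (B, C), (B, D), (C, D), (F, G)])
```

With these values E is simultaneously the midpoint of AB, of CD and of FG, and only the last two are inertial segments, so Theorem 4 applies to them and not directly to AB.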

THEOREM 6. The property of two finite segments of inertial paths being
parallel and in a fixed ratio is invariant in 𝔉.

PROOF: Let f(x) - f(y) = k[f(u) - f(v)], with f(x) - f(y) and
f(u) - f(v) segments of inertial paths. Without loss of generality we may
assume k > 1. Let z be the point such that f(x) - f(y) = k[f(x) - f(z)].
We now construct a parallelogram with f(u) - f(v) and f(x) - f(z) as two
parallel sides. By the previous theorem any parallelogram in f is carried
into a parallelogram in f' since the midpoint of the diagonals is preserved.
Thus

(1)  $f'(x) - f'(z) = f'(u) - f'(v)$,

but by Theorems 2 and 3

(2)  $f'(x) - f'(y) = k[f'(x) - f'(z)]$,

(for details see proof of Theorem 4), whence from (1) and (2)

$f'(x) - f'(y) = k[f'(u) - f'(v)]$.     Q.E.D.

As the final theorem about properties invariant in 𝔉, we want to
generalize the preceding theorem to arbitrary finite segments.

THEOREM 7.     The property of two arbitrary finite segments being parallel
and in a fixed ratio is invariant in 𝔉.

PROOF:     In view of preceding theorems, the crucial thing to show is
that if

$f(x) - f(y) = k[f(x) - f(z)]$

then

$f'(x) - f'(y) = k[f'(x) - f'(z)]$.

Our approach is to use an "inertial" parallelogram similar to the one used
in the proof of Theorem 5. In fact an exactly similar construction will be
used; points A to E are constructed identically, where A = f(x) and
B = f(y). Without loss of generality we may assume k > 2, that is,
that f(z) = F is between A and E. We then have that

(1)  $A - E = \tfrac{k}{2}[A - F]$.

We draw through F a line parallel to CD, which cuts AC at G and AD
at H. (See Figure 2.)
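Now (1) is equivalent to:

(2)  $F = \left(1 - \tfrac{2}{k}\right) A + \tfrac{2}{k} E$.

Moreover, by construction

(3)  $F = \tfrac{1}{2}(G + H)$,

(4)  $E = \tfrac{1}{2}(C + D)$,

(5)  $G = \left(1 - \tfrac{2}{k}\right) A + \tfrac{2}{k} C$,

(6)  $H = \left(1 - \tfrac{2}{k}\right) A + \tfrac{2}{k} D$.
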
Since GFH, AGC, AHD and CED are by construction segments of
inertial paths, by virtue of Theorems 4 and 6, we have from (3)-(6):

(7)  $F' = \tfrac{1}{2}(G' + H')$,

(8)  $E' = \tfrac{1}{2}(C' + D')$,

(9)  $G' = \left(1 - \tfrac{2}{k}\right) A' + \tfrac{2}{k} C'$,

(10)  $H' = \left(1 - \tfrac{2}{k}\right) A' + \tfrac{2}{k} D'$.

Substituting (9) and (10) in (7), we get:

(11)  $F' = \left(1 - \tfrac{2}{k}\right) A' + \tfrac{1}{k}(C' + D')$.

And now substituting (8) in (11), we obtain the desired result:

(12)  $F' = \left(1 - \tfrac{2}{k}\right) A' + \tfrac{2}{k} E'$.

But now by virtue of Theorem 5

$E' = \tfrac{1}{2}(A' + B')$,

which together with (12) yields:

$F' = \left(1 - \tfrac{1}{k}\right) A' + \tfrac{1}{k} B'$,

which is equivalent to:

(13)  $f'(x) - f'(y) = k[f'(x) - f'(z)]$.

The remainder of the proof, based upon considering f(x) - f(y) =
k[f(u) - f(v)], is exactly like that of Theorem 6 and may be omitted. (In
place of Theorems 2 and 3 in that proof we use the result just established.)

Q.E.D.


We now state the theorem toward which the preceding seven have been
directed.

THEOREM 8. Any two frames in 𝔉 are related by a non-singular affine
transformation.

PROOF: A familiar necessary and sufficient condition that a
transformation of a vector space be affine is that parallel finite segments with a
fixed ratio be carried into parallel segments with the same fixed ratio.
(See, e.g. Birkhoff and MacLane [1, p. 263].) Hence by virtue of Theorem
7 any two frames are related by an affine transformation.
Non-singularity of the transformation follows from the fact that each frame in 𝔉 is a
one-one mapping of X onto $R_4$. Q.E.D.
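The easy direction of the criterion used in this proof can be written out in one line. The following is a sketch only, assuming the row-vector convention $\varphi(x) = x\mathcal{A} + B$ that appears in the proof of Theorem 9; it shows that an affine map carries parallel segments in ratio k into parallel segments in the same ratio.

```latex
\begin{align*}
\varphi(x) &= x\mathcal{A} + B
  && \text{(assumed form of an affine map)} \\
x - y &= k\,(u - v)
  && \text{(parallel segments in ratio } k\text{)} \\
\varphi(x) - \varphi(y) &= (x - y)\,\mathcal{A}
  = k\,(u - v)\,\mathcal{A}
  = k\,[\varphi(u) - \varphi(v)].
\end{align*}
```

The translation B cancels in every difference, which is why only the matrix part matters for the ratio. The converse direction is the one cited from Birkhoff and MacLane.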

Once we have any two frames in 𝔉 related by an affine transformation,
it is not difficult to proceed to show that they are related by a Lorentz
transformation. In the proof of this latter fact, it is convenient to use a
Lemma about Lorentz matrices, which is proved in Rubin and Suppes [3],
and is simply a matter of direct computation.

LEMMA 1.     A matrix $\mathcal{A}$ (of order 4) is a Lorentz matrix if and only if

$\mathcal{A} \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix} \mathcal{A}^* = \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix}$.


We now prove the basic result:

THEOREM 9. Any two frames in 𝔉 are related by a Lorentz
transformation.

PROOF: Let f, f' be two frames in 𝔉. As before, for x in X, f(x) = x,
$f_1(x) = x_1$, f'(x) = x', etc. We consider the transformation φ such that
for every x in X, φ(x) = x'. By virtue of Theorem 8 there is a non-singular
matrix $\mathcal{A}$ (of order 4) and a four-dimensional vector B such that for every x
in X

$\varphi(x) = x\mathcal{A} + B$.

The proof reduces to showing that $\mathcal{A}$ is a Lorentz matrix.
Let

(1)  $\mathcal{A} = \begin{pmatrix} \mathcal{D} & E^* \\ F & g \end{pmatrix}$.

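And let α be a light line (in f) such that for any two distinct points x and y
of α, if $x = \langle Z_1, t_1 \rangle$ and $y = \langle Z_2, t_2 \rangle$, then

(2)  $W = \dfrac{Z_1 - Z_2}{t_1 - t_2}$.

Clearly |W| = c. Now let

(3)  $W' = \dfrac{Z_1' - Z_2'}{t_1' - t_2'}$.

From (1), (2) and (3) we have:

(4)  $W' = \dfrac{(Z_1 - Z_2)\mathcal{D} + (t_1 - t_2)F}{(Z_1 - Z_2)E^* + (t_1 - t_2)g}$.

Dividing all terms on the right of (4) by $t_1 - t_2$, and using (2), we obtain:

(5)  $W' = \dfrac{W\mathcal{D} + F}{WE^* + g}$.
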

At this point in the argument we need to know that |W'| = c, that is
to say, we need to know that if $I_f(xy) = 0$, then $I_{f'}(xy) = 0$. The proof of
this fact is not difficult. From our fundamental invariance axiom we have
that $I_{f'}^2(xy) \geq 0$, that is,

(6)  $|W'| \geq c$.


Consider now a sequence of inertial lines $\alpha_1, \alpha_2, \ldots$ whose slopes
$W_1, W_2, \ldots$ are such that

(7)  $\lim_{n \to \infty} W_n = W$.

Now corresponding to (5) we have:

(8)  $|W_n'| = \left| \dfrac{W_n\mathcal{D} + F}{W_n E^* + g} \right| < c$.

Whence, from (8) we conclude that if $WE^* + g \neq 0$, then

(9)  $|W'| = |\lim_{n \to \infty} W_n'| \leq c$.

Thus from (6) and (9) we infer

(10)  $|W'| = c$,

if $WE^* + g \neq 0$, but that this is so is easily seen. For, suppose not. Then

$\lim_{n \to \infty} (W_n E^* + g) = 0$,

and thus

$\lim_{n \to \infty} (W_n\mathcal{D} + F) = 0$.

Consequently $W\mathcal{D} + F = 0$, and $\langle W, 1 \rangle \mathcal{A} = 0$, which is absurd in view
of the non-singularity of $\mathcal{A}$.

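Since |W'| = c, we have by squaring (5):

(11)  $c^2 = \dfrac{W\mathcal{D}\mathcal{D}^*W^* + 2W\mathcal{D}F^* + |F|^2}{WE^*EW^* + 2WE^*g + g^2}$,

and consequently

(12)  $W(\mathcal{D}\mathcal{D}^* - c^2E^*E)W^* + 2W(\mathcal{D}F^* - c^2E^*g) + |F|^2 - c^2g^2 = 0$.

Since (12) holds for an arbitrary light line, we may replace W by -W,
and obtain (12) again. We thus infer:

$W(\mathcal{D}F^* - c^2E^*g) = 0$,

but the direction of W is arbitrary, whence

(13)  $\mathcal{D}F^* - c^2E^*g = 0$.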
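Now let $x = \langle 0, 0, 0, 0 \rangle$ and $y = \langle 0, 0, 0, 1 \rangle$. Then

$I_f^2(xy) = -c^2$.

But it is easily seen from (1) that

$I_{f'}^2(xy) = |F|^2 - c^2g^2$,

and thus by our fundamental invariance axiom

(14)  $c^2g^2 - |F|^2 = c^2$.

From (12), (13), (14) and the fact that $|W|^2 = c^2$, we infer:

$W(\mathcal{D}\mathcal{D}^* - c^2E^*E)W^* = |W|^2$,

and because the direction of W is arbitrary we conclude:

(15)  $\mathcal{D}\mathcal{D}^* - c^2E^*E = \mathcal{I}$,

where $\mathcal{I}$ is the identity matrix.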


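Now by direct computation on the basis of (1),

(16)  $\mathcal{A} \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix} \mathcal{A}^* = \begin{pmatrix} \mathcal{D}\mathcal{D}^* - c^2E^*E & \mathcal{D}F^* - c^2E^*g \\ (\mathcal{D}F^* - c^2E^*g)^* & FF^* - c^2g^2 \end{pmatrix}$.

From (13), (14), (15) and (16) we arrive finally at the result:

(17)  $\mathcal{A} \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix} \mathcal{A}^* = \begin{pmatrix} \mathcal{I} & 0 \\ 0 & -c^2 \end{pmatrix}$,

and thus by virtue of Lemma 1, $\mathcal{A}$ is a Lorentz matrix.     Q.E.D.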
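The criterion of Lemma 1 is easy to test on a concrete matrix. The following is a minimal sketch, assuming the row-vector convention $\langle Z', t' \rangle = \langle Z, t \rangle \mathcal{A}$ and taking for $\mathcal{A}$ the standard boost along the first spatial axis; the function names and the values of c and v are illustrative assumptions, not part of the text.

```python
import math

def boost_matrix(v, c):
    """Matrix A with <Z', t'> = <Z, t> A for a boost of speed v along axis 1."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return [
        [gamma,      0.0, 0.0, -gamma * v / c ** 2],
        [0.0,        1.0, 0.0, 0.0],
        [0.0,        0.0, 1.0, 0.0],
        [-gamma * v, 0.0, 0.0, gamma],
    ]

def metric_form(A, c):
    """Compute A * diag(1, 1, 1, -c^2) * A^T entry by entry."""
    G = [1.0, 1.0, 1.0, -c ** 2]
    return [[sum(A[i][k] * G[k] * A[j][k] for k in range(4))
             for j in range(4)] for i in range(4)]

def is_lorentz(A, c, tol=1e-9):
    """Check the condition of Lemma 1: A G A^T == G with G = diag(1,1,1,-c^2)."""
    G = [1.0, 1.0, 1.0, -c ** 2]
    M = metric_form(A, c)
    return all(abs(M[i][j] - (G[i] if i == j else 0.0)) <= tol
               for i in range(4) for j in range(4))

c, v = 2.0, 1.2                 # assumed illustrative values with |v| < c
A = boost_matrix(v, c)
F = A[3][:3]                    # bottom-left row block of A
g = A[3][3]                     # bottom-right scalar block of A
relation_14 = c ** 2 * g ** 2 - sum(f * f for f in F)

stretch = [[2.0 if i == j == 0 else float(i == j) for j in range(4)]
           for i in range(4)]   # a non-singular matrix that is not Lorentz
```

For the boost, `relation_14` reproduces the right-hand side $c^2$ of relation (14), while the pure stretch along one axis fails the test even though it is non-singular and affine.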

4. Temporal Parity. Turning now to problems of parity, we may for
simplicity restrict the discussion to time reversals. Similar considerations
apply to spatial reflections.

A simple axiom, which will prevent time reversal between frames in 𝔉,
is:

(T1)      There are elements x and y in X such that for all f in 𝔉

$f_4(x) < f_4(y)$.

There is, however, a simple objection to this axiom. It is unsatisfactory
to have time reversal depend on the existence of special space-time points,
which could possibly occur only in some remote region or epoch. This
objection is met by T2.

(T2)      If $I_f^2(xy) < 0$ then either for all f in 𝔉

$f_4(x) < f_4(y)$

or for all f in 𝔉

$f_4(y) < f_4(x)$.

T2 replaces the postulation of special points by a general property: given
any segment of an inertial path, all frames in 𝔉 must orient the direction
of time for this segment in the same way.
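What T2 demands can be seen on a small family of frames. The sketch below assumes c = 1 and a hypothetical family of boosts along the first axis; it compares the sign of $f_4(y) - f_4(x)$ across the family for an inertial pair and for a non-inertial pair, and then for a time-reversed frame.

```python
import math

def boost(p, v, c):
    """Coordinates of p = <x, y, z, t> after a boost of speed v along axis 1."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    x, y, z, t = p
    return (gamma * (x - v * t), y, z, gamma * (t - v * x / c ** 2))

def time_reversal(p):
    """Frame change that flips the sign of the time coordinate."""
    x, y, z, t = p
    return (x, y, z, -t)

def orientation(p, q, frame):
    """Sign of f_4(q) - f_4(p) in the given frame: +1, 0 or -1."""
    dt = frame(q)[3] - frame(p)[3]
    return (dt > 0) - (dt < 0)

c = 1.0
speeds = [-0.9, -0.5, 0.0, 0.5, 0.9]                     # assumed sample of frames
frames = [lambda p, v=v: boost(p, v, c) for v in speeds]

x = (0.0, 0.0, 0.0, 0.0)
y_inertial = (0.5, 0.0, 0.0, 1.0)        # squared interval 0.25 - 1 < 0
y_non_inertial = (2.0, 0.0, 0.0, 1.0)    # squared interval 4 - 1 > 0

inertial_signs = {orientation(x, y_inertial, f) for f in frames}
non_inertial_signs = {orientation(x, y_non_inertial, f) for f in frames}
reversed_sign = orientation(x, y_inertial, time_reversal)
```

All the boosts agree on the time order of the inertial pair, the time-reversed frame disagrees with them, and for the non-inertial pair the order already varies among the boosts, which is why T2 is restricted to $I_f^2(xy) < 0$.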

Nevertheless, there is another objection to T1 which holds also for T2:
the appropriate axiom should be formulated so that a given observer in a
frame f may verify it without observing any other frames, that is, he may
decide if he is a qualified candidate for membership in 𝔉 without
observing other members of 𝔉. (This issue is relevant to the single axiom of
Definition 3 but cannot be entered into here.) From a logical standpoint
this means eliminating quantification over elements of 𝔉, which may be
done by introducing a fourth primitive notion, a binary relation σ of
signaling on X. To block time reversal we need postulate but two
properties of σ:

(T3.1)     For every x in X there is a y in X such that xσy.
(T3.2)     If xσy then $f_4(x) < f_4(y)$.

However, a third objection to (T1) also applies to (T2) and (T3).
Namely, we are essentially postulating what we want to prove. The
axioms stated here correspond to postulating artificially in a theory of
measurement of mass that a certain object must be assigned the mass of
one. I pose the question: Is it possible to find "natural" axioms which fix a
direction of time? It may be mentioned that Robb's meticulous
axiomatization [2] in terms of the notion of after provides no answer.


References 

[1]    BIRKHOFF, G. and S. MACLANE, A Survey of Modern Algebra. New York 1941,
       XI + 450 pp.
[2]    ROBB, A. A., Geometry of Space and Time. Cambridge 1936, VII + 408 pp.
[3]    RUBIN, H. and P. SUPPES, Transformations of systems of relativistic particle
       mechanics. Pacific Journal of Mathematics, vol. 4 (1954), pp. 563-601.


Symposium  on  the  Axiomatic  Method 


AXIOMS  FOR  COSMOLOGY 

A.  G.  WALKER 

University  of  Liverpool,  Liverpool,  England 

1.  In relativistic cosmology there is a generally accepted form of
space-time which serves as a geometrical model for the large scale features
of the universe. It is a four dimensional manifold with a quadratic
differential metric

$dt^2 - R^2\,d\sigma^2$

where t is a preferential coordinate, R is a function of t only, and $d\sigma^2$ is
the metric of a three dimensional Riemannian space C of constant
curvature k. Topologically the space-time is a product T × C where T is
the continuum of real numbers (parametrised by t) and C is a 3-space
which may be spherical or elliptic (k > 0), hyperbolic (k < 0) or euclidean
(k = 0). Each point x of C represents a fundamental particle (corresponding
to a galaxy in the universe); the curve T × x, an orthogonal trajectory
of the hypersurfaces t = constant in space-time, is the world line of the
particle, and the null geodesics of space-time represent light paths. The
natural projection of each such null geodesic into C is a geodesic of C.

In dynamical theories, such as General Relativity, the form of space-time
is strictly invariant and the function R is significant in that, through the
field equations, it determines the distribution of matter in the universe.
In kinematical theories, however, space-time is only conformally
invariant with the result that R can be 'transformed away' by a regraduation
of the time scale, i.e. a transformation of the time parameter from t to τ
where dτ = dt/R. Ignoring a conformal factor, the metric of space-time
then becomes $d\tau^2 - d\sigma^2$ and the model is static. The τ-scale giving this
comparatively simple model is unique except for an arbitrary affine
transformation τ' = aτ + b (a > 0); with each τ-scale there is a 'natural'
measure of distance in C, by $\int d\sigma$, and the only effect of a regraduation
τ' = aτ + b is a change of distance unit by a factor a. With these
measurements of time and distance light can be said to have unit speed in C.
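The regraduation can be made explicit in a short worked computation. The particular scale function $R(t) = t/t_0$ below is an assumed example chosen only because the integral is elementary; it is not taken from the text.

```latex
\begin{align*}
ds^2 &= dt^2 - R(t)^2\,d\sigma^2, \qquad d\tau = \frac{dt}{R(t)}
  \;\Longrightarrow\; ds^2 = R^2\,\bigl(d\tau^2 - d\sigma^2\bigr), \\[4pt]
% Assumed example: R(t) = t/t_0 for t > 0, with t_0 > 0 a constant.
R(t) &= \frac{t}{t_0}: \qquad
  \tau = \int \frac{t_0\,dt}{t} = t_0 \log\frac{t}{t_0} + \text{const}, \\[4pt]
\tau' &= a\tau + b: \qquad
  d\tau'^2 - a^2\,d\sigma^2 = a^2\,\bigl(d\tau^2 - d\sigma^2\bigr).
\end{align*}
```

The first line shows that R survives only as the conformal factor $R^2$. The last line shows that after an affine change of the τ-scale light keeps unit speed provided distances are measured by $a\int d\sigma$, that is, with the unit of distance changed by the factor a.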

The above model, which is common to many theories, was derived first
by Lemaitre and others from Einstein's General Theory, when it was
interpreted as a dynamical as well as a kinematical model. Later it was
derived by Robertson and the writer independently as a purely
kinematical model, based on fewer assumptions than in General Relativity,
and this derivation is generally regarded as satisfactory and adequate in
modern cosmological theories. Nevertheless it is far from satisfactory as
an example of the axiomatic method, largely because of the initial
assumption that events can be described by numerical parameters, i.e. that
the natural topology of a geometrical model of the universe is that of a
manifold. This is a good working hypothesis in that it produces useful
results quickly, but we now wish to base the structure on a more
elementary set of axioms.

We  shall  be  talking  about  particles  (fundamental  particles)  and  the 
events  in  the  history  of  a  particle,  and  the  present  purpose  is  to  find  a 
system  of  axioms  from  which  we  can  deduce  two  theorems;  firstly,  that 
the  events  in  the  history  of  a  single  particle  are  'linearly'  ordered,  i.e. 
can  be  parametrised  by  a  single  real  parameter;  secondly,  that  the  set  of 
particles  can  be  given  the  topology  and  structure  of  a  geodesic  metric 
space  in  such  a  way  that  the  metric  has  the  properties  of  metric  in  the 
cosmological model. We shall not go all the way in establishing the
spherical, elliptic, hyperbolic or euclidean manifold structures on the set of
particles,  but  the  final  stage  is  not  difficult  once  the  geodesic  metric 
structure  is  established,  using  the  work  of  Busemann,  Montgomery  and 
Zippin  and  postulating  sufficient  symmetry  about  each  particle.  Our 
axiomatic  system  will  in  fact  cut  out  the  spherical  and  elliptic  models 
since  it  will  be  postulated  that  the  (light)  signal  correspondence  from  one 
particle  to  another  is  one-one.  It  would  need  a  more  complicated  system 
to  include  the  models  of  positive  curvature  and  this  is  not  discussed  here. 

2.  Before the axioms are stated the idea of light signals used to such
good effect by E. A. Milne [2] and others will be described briefly. One of
the primitives in the present system, the signal-mapping of one particle
set of events (world-line) on another, is based on this idea, and one of the
axioms appears artificial until it is related to the situation of equivalence
between particle-observers discussed by Milne.

Milne's particle-observer is the set of events in the history of a particle
together with a 'clock', i.e. a numerical parameter giving temporal order
in the set. If A and B are two particle-observers, light signals can be sent
from one to the other; the time of arrival s' at B can be expressed as a
function s' = θ(t) of the time t of emission at A, and the time of arrival t'
at A is a function $t' = \bar{\theta}(s)$ of the time s of emission at B. These 'times' at
A and B are recorded by the clocks attached to A and B. The
particle-observers (with their clocks) are said to be equivalent if the functions θ and
$\bar{\theta}$, called signal functions, are identical, and it can be shown that if A and
B are not equivalent, then B's clock can be regraduated by a
transformation of the form s' = ψ(s) so that they become equivalent.

Three particle-observers A, B, C are collinear, with B between A and C,
if the light signal from A to C is the same as that from A to B followed by
the signal from B to C, and similarly from C to A. They then form an
equivalent system if they are equivalent in pairs, and it is easily verified
that if θ and φ are the signal functions between A and B and between A
and C respectively, the condition for this is θ ∘ φ = φ ∘ θ. A collinear set of
particle-observers equivalent in pairs thus gives rise to a set of
commutative signal functions, and from the study of such a set Milne was able
to establish theorems on linear equivalences.

One serious disadvantage of this treatment is the assumption that
particle-observers can communicate with each other so that a signal
function is assumed to be knowable. It would be difficult to embody this
assumption in an axiomatic system and for that reason it will be assumed
in the present work that all observations are to be made by only one
observer. Thus if A is this observer and if A and B are equivalent in Milne's
sense, A cannot observe the signal function θ between A and B but he can
observe the function θ² = θ ∘ θ, since if a light signal emitted by A at time t
is reflected at B and is received by A at time t', then t' is an observable
function of t given by t' = θ²(t). Again, if collinear particles A, B, C are
equivalent in Milne's sense and if θ, φ are the signal functions between A
and B and between A and C as before, then θ ∘ φ = φ ∘ θ; but if A is the only
observer, this is not an observable relation. A consequence, however, is
θ² ∘ φ² = φ² ∘ θ², and this is an observable relation since θ² and φ² are
observable. This simple relation is independent of the choice of clock scales
and can be illustrated as in fig. 1, where the vertical lines represent
world-lines and the other lines light paths. Although derived from Milne's idea
of equivalence it is in fact weaker, and provides the main suggestion for our
Axiom IX, which is equivalent to it when applied to collinear particles.

Fig. 1
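The observable relation θ² ∘ φ² = φ² ∘ θ² can be tried out on simple signal functions. The sketch below uses assumed toy models only: a static collinear system in which a signal takes a fixed time to cross each gap, and a deliberately mismatched pair of functions that do not commute. All names and numbers are hypothetical.

```python
def compose(*fs):
    """Right-to-left composition: compose(f, g)(t) == f(g(t))."""
    def h(t):
        for f in reversed(fs):
            t = f(t)
        return t
    return h

def square(f):
    """The round-trip function f o f seen by the single observer."""
    return compose(f, f)

def observable_relation_holds(theta, phi, samples):
    """Check theta^2 o phi^2 == phi^2 o theta^2 on the sample times."""
    lhs = compose(square(theta), square(phi))
    rhs = compose(square(phi), square(theta))
    return all(lhs(t) == rhs(t) for t in samples)

samples = [0, 1, 5, 10]

# Assumed static model: B at light-time 1 from A, C at light-time 3 from A.
theta_static = lambda t: t + 1
phi_static = lambda t: t + 3

# Assumed mismatched pair: these two functions do not commute.
theta_bad = lambda t: t + 1
phi_bad = lambda t: 2 * t

static_ok = observable_relation_holds(theta_static, phi_static, samples)
bad_ok = observable_relation_holds(theta_bad, phi_bad, samples)
```

Only the squares θ² and φ² enter the check, so it uses nothing that the single observer A could not record with his own clock.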

3.  The primitives of the axiomatic system to be considered here are
events and certain sets of events called particles. The events of one particle
O, called the observer, satisfy a total order relation, described by the words
'before' and 'after'; if x and y are distinct events of O, then either x < y
(x is before y, equivalent to y > x, i.e. y is after x) or y < x. Lastly, if A
and B are any two particles, there is a signal-mapping of A onto B,
denoted by (A, B).

AXIOM I.  The order relation in O is transitive, i.e. if x, y, z are events
of O such that x < y and y < z, then x < z.

AXIOM  II.     Every  signal-mapping  is  one-one. 

Thus (A, B) has a single valued inverse (A, B)⁻¹ which is a mapping of
B onto A.

DEFINITION.  An OBSERVABLE is a mapping O → O resulting from a
chain of signal mappings or inverse signal mappings.

One example of an observable is the signal-mapping (O, A) followed by
the mapping (A, O); this will be denoted by (O, A)(A, O), which is here
more convenient than the usual (A, O) ∘ (O, A). Another example is
(O, A)(A, B)(O, B)⁻¹, which will turn out later to be the identity mapping
O → O when O, A, B are 'collinear'.

All  further  axioms  can  now  be  expressed  in  terms  of  observables  and 
the  order  relation  on  0,  but  for  convenience  we  shall  define  and  use 
relative  observables. 

By means of the mapping (O, A) and the order relation on O, an order
relation can be induced on A, and we shall use the symbols <, > and words
'before' and 'after' when describing this relation. An OBSERVABLE
RELATIVE TO A is defined as a mapping A → A resulting from a chain of
signal mappings and inverses, and an axiom may be expressed in terms of
observables relative to any particle A and the order relation on A. This
is only a matter of convenience; any such expression could always be
restated in terms of proper observables, i.e. observables relative to O, for
if f is an observable relative to A, then (O, A) f (O, A)⁻¹ is a corresponding
proper observable.


Let A, B, C be any three particles, and let g be the observable relative
to A defined as follows (see fig. 2)

g = (A, B)(B, C)(A, C)⁻¹.

AXIOM III.     g(a) ≥ a for all events a ∈ A.

AXIOM IV.  g is strictly increasing, i.e. if a, a' are events of A such that
a' > a, then g(a') > g(a).

It is to be understood that in axioms such as these the particles are not
necessarily distinct, i.e. B or C may be a copy of A, with the convention
that the signal mapping (A, A) is the identity A → A. Putting C = A in
Axioms III and IV we thus have that the observable f relative to
A, defined by f = (A, B)(B, A), satisfies the same conditions as g in the
axioms.
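Axioms III and IV can be illustrated in a familiar special case. The following is a sketch under an assumed model that is not part of the axiom system: particles are fixed points of the Euclidean plane, each event on a particle is labelled by its clock time, and light has unit speed, so that (P, Q) adds the distance from P to Q. Axiom III then reduces to the triangle inequality.

```python
import math

def signal(P, Q):
    """Signal-mapping (P, Q): emission time at P -> arrival time at Q."""
    d = math.dist(P, Q)
    return lambda t: t + d

def signal_inverse(P, Q):
    """Inverse mapping (P, Q)^-1: arrival time at Q -> emission time at P."""
    d = math.dist(P, Q)
    return lambda t: t - d

def g(A, B, C):
    """Observable (A, B)(B, C)(A, C)^-1 relative to A, applied left to right."""
    ab, bc, ac_inv = signal(A, B), signal(B, C), signal_inverse(A, C)
    return lambda a: ac_inv(bc(ab(a)))

A = (0.0, 0.0)
C = (2.0, 0.0)
B_off = (1.0, 1.0)     # not on the line AC
B_on = (1.0, 0.0)      # on the line AC, between A and C
events = [-3.0, 0.0, 2.5, 10.0]

g_off = g(A, B_off, C)
g_on = g(A, B_on, C)
f = g(A, B_off, A)     # the case C = A, i.e. f = (A, B)(B, A)

axiom_iii = all(g_off(a) >= a and g_on(a) >= a and f(a) >= a for a in events)
axiom_iv = all(g_off(a2) > g_off(a1) for a1, a2 in zip(events, events[1:]))
equality_when_between = all(math.isclose(g_on(a), a, abs_tol=1e-12) for a in events)
```

In this model the detour through B delays the signal unless B lies between A and C, in which case g is the identity on A.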

We also see from Axiom IV with A = O that if B and C are any two
particles, the order relation induced in C from B by means of the mapping
(B, C) is the same as the order induced in C from O. It follows that the
observer O loses its preferential position; the whole system is the same
relative to a 'subordinate observer' at A with the order relation induced
from O.

It can now be assumed without a further axiom that the ordered set O is
closed, and therefore that every particle set is closed, i.e. that every
bounded sequence of events in a particle has a limit. If the particle sets
are not closed, new events can be defined in the usual way as sections or
by sequences, and the sets of new events are closed. Further, the signal
mappings can be extended in a natural way to the new particle-sets so
that the above axioms are still satisfied. It will therefore be assumed that
O, and hence every particle set of events, is closed.

DEFINITION.  Particles A and B COINCIDE at the event a ∈ A if f(a) = a
where f = (A, B)(B, A).

We see that if A and B coincide at a ∈ A and if (A, B)(a) = b, then A
and B coincide at the event b ∈ B; for we have f(a) = (B, A) ∘ (A, B)(a) =
(B, A)(b), and since f(a) = a this gives (B, A)(b) = a; hence
(B, A)(A, B)(b) = (A, B) ∘ (B, A)(b) = (A, B)(a) = b.


We could, if we wished to follow the mechanical picture, regard a and
b here as the same event and so allow particle sets to intersect. This is
unnecessary, however, because our particles are restricted to correspond
to what were formerly called fundamental particles and therefore are
required not to coincide, which leads to the next axiom.

AXIOM V.     No two distinct particles coincide at any event.¹

It follows that no particle set has a first or last event (assuming that
there is more than one particle; see Axiom VII).² For if a is a last event of
a particle A and B is another particle, then by Axiom III with C = A,
f(a) ≥ a where f = (A, B)(B, A), and hence f(a) = a, i.e. A and B
coincide at a. Similarly A and B would coincide at a first event of A.

4.  DEFINITION.  Particles A, B, C are COLLINEAR, with B between A
and C, if

(A, B)(B, C) = (A, C),     (C, B)(B, A) = (C, A).

These conditions can be expressed in terms of observables and require
the observables relative to A, given by (A, B)(B, C)(A, C)⁻¹ and
(C, A)⁻¹(C, B)(B, A), both to be the identity mapping A → A. They
correspond, therefore, to the case of equality in Axiom III.

AXIOM  VI.  If  A,  B,  C,  D  are  particles  with  A,  B,  C  collinear  in  some 
order,  A,  B,  D  collinear  in  some  order,  and  A,  B  distinct,  then  A,  C,  D  are 
collinear  in  some  order. 

It  follows  from  this  that  the  set  of  all  particles  collinear,  in  some  order, 
with  two  distinct  particles  is  a  linear  system  which  is  determined  by  any 
two  distinct  members.  From  the  'between'  relation  in  the  above  definition 
and  Axiom  III  we  see  that  a  linear  system  is  totally  ordered.  In  particular 
we  can  talk  about  particles  of  a  linear  system  being  on  the  'same  side'  or 
on  'opposite  sides'  of  a  member  of  the  system. 

1  This  axiom  is  in  fact  redundant,  but  to  do  without  it  would  mean  a  great  deal 
of  additional  work.  A  theorem  equivalent  to  this  axiom  is  proved  in  [4]  where  also 
Axiom  VIII  is  weakened. 

2  We  could,  of  course,  make  an  exception  of  first  and  last  events  in  Axiom  V, 
and  it  would  then  follow  that  if  one  particle  has,  for  example,  a  first  event  then  all 
particles  coincide  at  this  event,  which  is  what  happens  in  Milne's  model  with  the 
t-scale  of  time.  However,  extreme  events  of  this  kind  can  be  excluded  without 
affecting  the  system  and  the  form  of  Axiom  V  given  here  appears  to  be  preferable. 


DEFINITION.  A linear system L is DENSE at a particle A ∈ L if, for any
two events a, a' of A with a' > a, there is a particle B ∈ L distinct from A
such that (A, B)(B, A)(a) < a'.

It is not difficult to prove that, if L is dense at A, there is a sequence
{A_n} of particles in L and on the same side of A such that, for every event
a ∈ A, f_n(a) → a as n → ∞ where f_n = (A, A_n)(A_n, A). We shall write
A_n → A.

It should be noted that this definition of denseness in a linear system is
stronger  than  denseness  in  the  ordinary  sense  for  a  totally  ordered  set 
since  it  involves  the  ordered  set  A  of  events.  One  consequence,  of  course, 
is  that  L  is  dense  at  A  in  the  ordinary  sense  that  if  B,  C  are  members  of  L 
with  A  between  them,  then  there  is  another  member  of  L  which  is  distinct 
from  A,  B,  C  and  between  B  and  C.  Another  consequence  is  as  follows. 

If there is a linear system of particles which is dense at some member
then the particle set O of events (and hence every particle set) is continuous
in the sense that if x, y are any two events of O and x < y, there is an
event z of O such that x < z < y.

AXIOM  VII.     There  are  at  least  two  distinct  particles. 

Hence  there  is  at  least  one  linear  system  of  particles. 

AXIOM  VIII.     Every  linear  system  of  particles  is  dense  at  every  member  3. 

An  immediate  consequence  of  this  axiom,  as  remarked  above,  is  that 
every  particle  set  of  events  is  continuous.  We  are  now  in  a  position  to 
prove  the  following  theorem. 

The particle set O, and hence every particle set, is ordinally equivalent to the
continuum of real numbers.

It is sufficient to prove this for any one particle A, and since we already
have that the set A is closed, it is sufficient to prove that there is an
enumerable subset of A such that between any two events of A is a
member of the subset.

Let L be a linear system containing A. Then L is dense at A by Axiom
VIII and there is a subset $\{A_n\}$ of L such that $A_n \to A$. As before we
write $f_n$ for $(A, A_n)(A_n, A)$ and define $f_n^p$ for any integer p in the usual
way; thus $f_n^0$ is the identity mapping A → A, $f_n^1 = f_n$, $f_n^{p+1} = f_n \circ f_n^p$.

3 Because of an axiom of symmetry which comes later (§ 6), Axiom VIII could be
weakened to state that every linear system is dense at some member; it would then
follow from symmetry that the system is dense at every member. The present axiom
is chosen, however, so that certain theorems can be proved immediately. (Cf. [4].)


Let a be an event of A, and consider the subset of A given by the events
$f_n^p(a)$ where n takes all positive integer values and p takes all integer
values. This subset is enumerable, and we shall prove that if x, y are
any two events of A and x < y, there is a member of this subset between
x and y. It will be sufficient to consider the case a < x < y; the proof for
the case x < y < a is very similar, and the case x < a < y is trivial.

Let x, y be events of A and a < x < y. Then since $A_n \to A$, there is an
n such that $f_n(x) < y$. Keeping n fixed, consider the sequence $f_n^m(a)$
where m takes positive or zero integer values. This sequence is unbounded
as $m \to \infty$, for if $f_n^m(a) \to z$ as $m \to \infty$ then $f_n(z) = z$ and $A_n$ coincides
with A at the event z, which contradicts Axiom V. Hence, since a < x,
there is a positive or zero integer m such that

$$f_n^m(a) \le x < f_n^{m+1}(a).$$

The mapping $f_n$ is increasing by Axiom IV and hence

$$x < f_n^{m+1}(a) = f_n(f_n^m(a)) \le f_n(x) < y,$$

i.e. the event $f_n^{m+1}(a)$ in the enumerable subset of A is between x and y,
as required.
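
The interpolation step of this proof can be illustrated numerically. The following is only a sketch under stated assumptions and is not part of the original argument: the function `f` is an assumed stand-in for a single signal mapping $f_n$ (any continuous increasing map with f(t) > t would serve), real numbers stand in for events, and the helper name `first_iterate_beyond` is hypothetical.

```python
import math

def f(t):
    # Assumed stand-in for f_n: increasing (derivative 1 + 0.05*cos t > 0)
    # and free of fixed points, since f(t) - t >= 0.05 for every t.
    return t + 0.05 * (2.0 + math.sin(t))

def first_iterate_beyond(f, a, x, max_steps=10_000):
    """Return (m, f^(m+1)(a)) with f^m(a) <= x < f^(m+1)(a); requires a <= x."""
    current = a
    for m in range(max_steps):
        nxt = f(current)
        if current <= x < nxt:
            return m, nxt
        current = nxt
    raise ValueError("iterates did not pass x within max_steps")

a, x = 0.0, 0.73
y = f(x) + 0.01            # any y with f(x) < y, as chosen in the proof
m, z = first_iterate_beyond(f, a, x)
print(m, z, x < z < y)     # z = f^(m+1)(a) lies strictly between x and y
```

Because `f` is increasing and the m-th iterate does not exceed x, the next iterate cannot exceed f(x), which is the only fact the proof uses to place it below y.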

The proof for x < y < a is similar but with inverse signal mappings
$f_n^{-1}$.

This  theorem  shows  that  every  particle  set  can  be  mapped  onto  the 
continuum  of  real  numbers  so  that  order  is  preserved  in  the  sense  that 
'before'  corresponds  to  'is  less  than'.  Such  a  mapping  will  be  called  a 
clock,  and  the  real  number  corresponding  to  an  event  is  a  clock  reading. 
The  mapping  is  not,  of  course,  unique,  and  a  change  of  mapping  may  be 
called a 'clock regraduation'; it corresponds to a transformation t' = ψ(t)
of the 'time' parameter t, where ψ is a continuous increasing function
taking  all  values. 

When  a  particle  A  is  provided  with  a  clock  in  this  way,  every  observable 
relative  to  A  can  be  represented  as  a  function  of  the  time  parameter,  and 
from  the  axioms  it  follows  that  all  such  functions  are  continuous  and 
increasing and take all values. If f is such a function, it is transformed into
$\psi \circ f \circ \psi^{-1}$ when A's clock is regraduated by t' = ψ(t).

5.  The  next  axiom  was  suggested  by  a  property  of  Milne's  equivalent 
system  of  collinear  particle-observers  (see  §  2)  but  applies  to  any  three 
particles,  not  necessarily  collinear. 



AXIOM IX. If A, B, C are any three particles then (A : B, C) =
(A : C, B), where (A : B, C) denotes the observable $(A, B)(B, C)(C, B)(A, B)^{-1}$
relative to A.

If A, B, C are collinear with A not between B and C this axiom is
seen to be already satisfied. If however A is between B and C, then
$(C, B)(A, B)^{-1} = (C, A)$ and the axiom gives

(A,  B)(B,  C)(C,  A)  =  (A,  C)(C,  B)(B,  A) 
i.e. 

(A, B)(B, A)(A, C)(C, A) = (A, C)(C, A)(A, B)(B, A),

showing  that  the  two  observables  (A,  B)(B,  A)  and  (A,  C)(C,  A)  relative 
to  A  commute. 

Again,  if  A,  B,  C  are  collinear  with  B  between  A  and  C,  then  the  axiom 
applied  to  B,  A,  C  in  this  order  gives 

(B, A)(A, C)(C, B) = (B, C)(C, A)(A, B)
and hence

(A, B)(B, A)(A, C)(C, B)(B, A) = (A, B)(B, C)(C, A)(A, B)(B, A)

i.e.

(A, B)(B, A)(A, C)(C, A) = (A, C)(C, A)(A, B)(B, A)

since  (C,  B)(B,  A)  =  (C,  A)  and  (A,  B)(B,C)  =  (A,C).  Thus  the  ob- 
servables (A,  B)(B,  A)  and  (A,  C)(C,  A)  relative  to  A  commute  as  before. 
A  similar  result  occurs  if  C  is  between  A  and  B.  Hence: 

If A, B, C are collinear in some order, the observables (A, B)(B, A)
and (A, C)(C, A) relative to particle A commute.

Consider now a linear system L of particles containing A, and for X ∈ L
denote by $f_X$ the observable (A, X)(X, A) relative to A. Then from what
has just been proved, if X, Y are any two members of L,

$$f_X \circ f_Y = f_Y \circ f_X.$$

Suppose now A is assigned a clock, i.e. a 'time' parameter t; then
observables such as $f_X$ are represented by continuous increasing functions
$f_X(t)$ taking all values, and any two such functions corresponding to
members of L commute. We thus have a system L of commutative
functions which, because of the denseness of L at A, contains a sequence
which converges uniformly to the identity. Hence, from a theorem on sets
of commutative functions [3], there exists a continuous increasing
function ψ(t), taking all values, such that every function $f_X \in L$ can be
expressed in the form

$$f_X(t) = \psi^{-1}\{2d_X + \psi(t)\}$$

where $d_X$ is a positive or zero constant depending upon X.

If now A's clock is regraduated to read time τ where τ = ψ(t), the
observable function $f_X(t)$ is transformed into the function $\tau + 2d_X$. We
have thus proved that:

If L is a linear system of particles containing A, a clock reading time τ
can always be assigned to A so that, if X is any member of L, the
observable (A, X)(X, A) relative to A is represented by the function
$\tau + 2d_X$ where $d_X$ is a positive or zero constant depending upon X. Such
a clock will be called a τ-CLOCK relative to L.

It also follows from the theorems on commutative functions that A's
τ-clock is determined uniquely by the linear system except for an arbitrary
affine regraduation τ' = aτ + b, a > 0. The only effect of such a
regraduation on the observable functions $\tau + 2d_X$ is to multiply all the
constants $d_X$ by the same factor a.
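
As a numerical illustration of these statements, the sketch below builds observables of the form $f_X(t) = \psi^{-1}\{2d_X + \psi(t)\}$ and checks that they commute, that the regraduation τ = ψ(t) turns them into translations by $2d_X$, and that an affine regraduation scales the constant by a. The choice ψ = sinh and all function names are assumptions made only for this example; the text does not single out any particular ψ.

```python
import math

def psi(t):
    # Assumed example of a continuous increasing function taking all values.
    return math.sinh(t)

def psi_inv(tau):
    return math.asinh(tau)

def make_observable(d):
    """Observable f_X(t) = psi^{-1}(2*d + psi(t)) for a constant d >= 0."""
    return lambda t: psi_inv(2.0 * d + psi(t))

def regraduate(f, g, g_inv):
    """Transform of f under the clock regraduation t' = g(t): g o f o g^{-1}."""
    return lambda s: g(f(g_inv(s)))

f_x = make_observable(0.5)
f_y = make_observable(1.25)

# Any two observables of this family commute.
t = 0.3
commutator_gap = abs(f_x(f_y(t)) - f_y(f_x(t)))

# On the tau-clock (tau = psi(t)) the observable becomes tau + 2*d.
f_x_tau = regraduate(f_x, psi, psi_inv)
shift = f_x_tau(2.0) - 2.0

# An affine regraduation tau' = a*tau + b multiplies the shift by a.
a, b = 3.0, -1.0
f_x_affine = regraduate(f_x_tau, lambda s: a * s + b, lambda s: (s - b) / a)
shift_affine = f_x_affine(2.0) - 2.0

print(commutator_gap, shift, shift_affine)
```

Up to rounding, the two shifts are 2d = 1 and 2ad = 3, in agreement with the statement that the constants $d_X$ are all multiplied by the factor a.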

We now define the DISTANCE d(A, X) from A to X to be $d_X$. We
observe that d(A, X) ≥ 0, and equality occurs when and only when
X = A. In terms of readings on A's τ-clock the distance d(A, X) is given
by

$$d(A, X) = \tfrac{1}{2}(\bar\tau - \tau)$$

where τ has any value and $\bar\tau = f_X(\tau)$, $f_X = (A, X)(X, A)$. This formula
indicates again how the 'scale' of distance depends upon the choice of
τ-clock; under the allowable change of τ-scale given by τ' = aτ + b we get

$$d'(A, X) = \tfrac{1}{2}(\bar\tau' - \tau') = \tfrac{1}{2}a(\bar\tau - \tau) = a\,d(A, X).$$

When a τ-clock has been assigned to A, a clock can be assigned to any
other particle X ∈ L by taking the clock reading at the event (A, X)(τ)
to be $\tau + d_X$, and it can easily be verified that this parametrisation of X
is for X a proper τ-clock relative to L.

DEFINITION. The τ-clock assigned to X as above is EQUIVALENT to
A's τ-clock.

It can be verified that for all particles of L and τ-clocks relative to L,
equivalence as defined here is reflexive and transitive. Also, for any
particles X, Y of L, the distance d(X, Y) measured in relation to a τ-clock
of X (relative to L) is equal to the distance d(Y, X) measured in relation
to the equivalent τ-clock of Y, and for any three particles X, Y, Z of L,
with Y between X and Z,

d(X, Z) = d(X, Y) + d(Y, Z)

where distances are measured in relation to equivalent τ-clocks relative
to L.

If a τ-clock undergoes an additive regraduation τ' = τ + b, it is still
a τ-clock and distances measured in relation to it are unaltered. However,
if one of two equivalent τ-clocks undergoes this regraduation with b ≠ 0,
it ceases to be equivalent to the other, and we shall say that the clocks are
then congruent.

DEFINITION. If τ-clocks relative to L are attached to two particles of a
linear system L and are equivalent to within additive regraduations, they
are CONGRUENT.

From the properties of equivalence it follows that for all particles of L
and τ-clocks relative to L, congruence is reflexive and transitive. Also,
the distance relations d(X, Y) = d(Y, X), d(X, Z) = d(X, Y) + d(Y, Z)
for particles of L hold when distances are measured in relation to
congruent τ-clocks.

6.  We  now  wish  to  extend  this  idea  of  distance  from  a  linear  system  to 
the  whole  system  of  particles  and  so  establish  a  metric  on  the  'space'  of 
particles.  For  this  we  need  the  general  form  of  Axiom  IX  together  with  a 
new  axiom  of  symmetry. 

Consider first a mapping p of the set of particles onto itself which leaves
a particle A invariant. An observable f relative to A is a mapping A → A
determined in some way by a sequence of particles B, C, ..., and the
transform of f under p may be defined as the observable A → A
determined in the same way by the sequence B', C', ... where B' = p(B),
etc.

DEFINITION. An A-TRANSFORMATION is a one-one mapping of the set
of particles onto itself which leaves A invariant and is such that every
observable relative to A is transformed into itself.

Since the property of collinearity of particles can be defined in terms of
observables relative to any particle A, it follows that a linear system of
particles is mapped onto a linear system by any A-transformation. Also,
if L is a linear system containing A and if L is mapped onto L' by an
A-transformation, a τ-clock of A relative to L is also a τ-clock relative to L'.
If X, Y are members of L and d(X, Y) the distance between them in
relation to a τ-clock of A, and if X', Y' are the images of X, Y under the
A-transformation and d(X', Y') the distance between them in relation to
the same τ-clock of A, then d(X', Y') = d(X, Y).

DEFINITION.  A  HALF-LINE  at  a  particle  A  is  part  of  a  linear  system 
containing  A  ;  it  consists  of  A  and  all  the  particles  on  one  side  of  A  . 

Thus  a  linear  system  containing  A  is  the  union  of  two  half-lines  at  A  . 
If  B  is  a  particle  distinct  from  A,  there  is  just  one  half-line  at  A  which 
contains  B. 

AXIOM X. If A is any particle and l, m any two half-lines at A, there is
an A-transformation which maps l onto m.

We can now talk of any observer A being assigned a τ-clock without
reference to any particular linear system containing A; a τ-clock relative
to one such linear system will also be a τ-clock relative to any other
because of Axiom X and the property of an A-transformation mentioned
above. Thus to every observer can be assigned a τ-clock which is unique
to within an arbitrary affine regraduation.

The definition of congruence of τ-clocks given in § 5 applies to any two
observers, and we can now prove that this congruence is transitive, i.e. if
to any three observers A, B, C are assigned τ-clocks such that those of A
and B are congruent and those of A and C are congruent, then those of B
and C are congruent. To prove this it is sufficient to prove that the
distances d(B, C) and d(C, B), measured in relation to the τ-clocks
assigned to B and C respectively, are equal. This is a consequence of the
formula $\tfrac{1}{2}(\bar\tau - \tau)$ for distance in terms of τ-clock readings and the
definition of congruent τ-clocks applied to the pairs A, B and A, C, for we
find that, for any number τ,

$$d(B, C) = \tfrac{1}{2}\{(A : B, C)(\tau) - \tau\}, \qquad d(C, B) = \tfrac{1}{2}\{(A : C, B)(\tau) - \tau\}$$

and these are equal by Axiom IX.

Thus τ-clocks can be assigned to all particles so that they are congruent
in pairs, and if a particular τ-clock is assigned to one particle, say O, the
congruent τ-clock attached to any other particle is unique to within an
arbitrary additive regraduation τ' = τ + b. Such a regraduation does not
affect measurements of distance and hence the distance d(X, Y) between
any two particles X, Y is uniquely determined. If O's τ-clock undergoes an
affine regraduation τ' = aτ + b, a > 0, then all distances are multiplied
by the same factor a. We have thus defined a METRIC on the set of
particles, and it is easily verified that all the fundamental properties of a
metric are satisfied. For example, the triangular inequality

d(A, C) ≤ d(A, B) + d(B, C)

is mainly a consequence of Axiom III, for we have in terms of A's τ-clock
and for any number τ, $2d(B, C) = \bar\tau - \tau$ where

$$\bar\tau = (A, B)(B, C)(C, B)(A, B)^{-1}(\tau).$$

Also,

$$2d(A, B) = (A, B)(B, A)(\bar\tau) - \bar\tau$$

and hence

$$2d(A, B) + 2d(B, C) = (A, B)(B, C)(C, B)(B, A)(\tau) - \tau.$$

Since by Axiom III, $(A, B)(B, C)(\tau) \ge (A, C)(\tau) = \tau'$ say, and
$(C, B)(B, A)(\tau') \ge (C, A)(\tau')$, we have from Axiom IV

$$2d(A, B) + 2d(B, C) \ge (C, B)(B, A)(\tau') - \tau \ge (C, A)(\tau') - \tau,$$

i.e.

$$2d(A, B) + 2d(B, C) \ge (A, C)(C, A)(\tau) - \tau = 2d(A, C)$$

as  required. 

The structure we have given to the set of particles is not merely that
of a metric space; it is that of a geodesic metric space as defined by
Busemann [1], for it is easily verified that the linear systems of particles are
geodesics of the metric space and have the properties required for a
geodesic space. We have thus reached our second objective.

7. By Axiom X the geodesic space of particles is symmetric in the
sense that for any particle A and two half-geodesics at A, there is an
isometry of the space which leaves A invariant and maps one half-geodesic
on the other. From what is already known about geodesic
spaces it would not be difficult to select further axioms to ensure that the
space is 3-dimensional hyperbolic or euclidean. For example, we could
define a ROTATION about A as an isometry which is either the identity
mapping or leaves one and only one geodesic through A point-wise
invariant; then replace 'A-transformation' by 'rotation' in Axiom X and
postulate that the set of all rotations about A is a group.



The final task in the derivation of the space-time model is to establish τ
as a 'cosmic' coordinate, i.e. to show that all the particles can be assigned
τ-clocks which are not merely congruent but also equivalent to each
other. We then have the product structure T × C on the set of all events,
C being the space of particles, and it is a straightforward matter to define
a metric on T × C, determine light paths (defined in terms of linear
systems and signal mappings) and so complete the features of the
cosmological model.

Equivalent τ-clocks have already been defined and it is an open question
whether the transitivity of this equivalence for all particles is a
consequence of the axioms already given. If necessary it is a simple matter to
find an additional axiom which gives the property of transitivity. For
example, if A, B, C are any three particles with τ-clocks such that those
of A and B are equivalent and those of A and C are equivalent, it can be
verified that the observable functions (A, B)(B, C)(C, A)(τ) and
(A, C)(C, B)(B, A)(τ) are both of the form τ + constant, and that they
are the same function if and only if the τ-clocks of B and C are equivalent.
It would be sufficient, therefore, to postulate:

AXIOM XI. If A, B, C are any particles,

(A, B)(B, C)(C, A) = (A, C)(C, B)(B, A).

It  is  possible,  however,  that  this  is  a  consequence  of  the  previous 
axioms. 


Bibliography 

[1] BUSEMANN, H., The Geometry of Geodesics. Academic Press Inc., New York, 1955, X + 422 pp.
[2] MILNE, E. A., Kinematic Relativity. Oxford 1948, VII + 238 pp.
[3] WALKER, A. G., Commutative functions, I. Quarterly Journal of Mathematics, vol. 17 (1946) pp. 65-82.
[4] WALKER, A. G., Foundations of Relativity. Proceedings of the Royal Society of Edinburgh, vol. 62 (1948) pp. 319-335.


Symposium  on  the  Axiomatic  Method 


AXIOMATIC  METHOD  AND  THEORY  OF  RELATIVITY 

EQUIVALENT  OBSERVERS  AND 
SPECIAL  PRINCIPLE  OF  RELATIVITY 

YOSHIO  UENO 

Hiroshima  University,  Hiroshima,  Japan 

1. Axiomatization of Relativity Theory. Roughly speaking, there are
two  different  approaches  when  we  try  to  examine  the  foundation  of 
relativity  by  means  of  axiomatic  methods.  In  the  first  approach  one  tries 
to  axiomatize  the  theory  of  relativity  as  it  is  now.  According  to  the  second, 
one  does  not  necessarily  aim  at  deriving  the  present  theory.  Rather,  one 
investigates  various  possible  ways  of  axiomatizing  the  theory  of  relativity, 
in  the  hope  that  one  will  be  able  to  examine  prospective  forms  of  new 
theories. 

In  the  first  approach,  one  postulates  at  the  beginning  the  present 
relativity  theory  as  the  firmly  established  theory  and  asks  what  set  of 
axioms is equivalent to the theory. Most of the work done so far has
taken  this  approach.  Certainly,  most  people  accept  general  relativity  as 
well  as  special  relativity  as  firmly  established  theories,  just  like  classical 
mechanics  and  electrodynamics. 

However,  one  needs  to  reinvestigate  some  of  the  fundamental  concepts 
of  relativity  such  as  space-time,  scale,  clock  and  equivalence  of  observers, 
although  they  are  now  regarded  as  completely  established  beyond  any 
doubt.  For  instance,  the  fact  that  the  so-called  clock  paradox  is  still 
discussed  today  indicates  that  there  remains  some  ambiguity  about  the 
definition  and  interpretation  of  an  observer  or  a  moving  clock. 

Furthermore,  we  know  some  examples  of  peculiar  structure  of  space- 
time as shown by Gödel's peculiar cosmological solution [1] and also by
another  peculiar  solution  due  to  Nariai  [2].  We  cannot  reject  these  peculiar 
solutions  only  from  fundamental  principles  of  relativity.  This  may  be 
again  a  reason  for  reinvestigating  fundamental  principles  of  relativity. 
Of  course,  to  these  peculiar  solutions,  the  respective  authors  gave  physical 
interpretations  which  seem  reasonable.  However,  to  insure  the  validity  of 
such  interpretations,  we  will  have  to  understand  clearly  the  fundamental 
principles of general relativity. It is beyond any doubt that axiomatic
methods are very useful for studies of this kind. I will not, however, go
into details of such studies here.

Comparing  these  two  alternative  approaches,  we  may  say  that  while 
logical  formulation  is  the  central  problem  of  the  first,  heuristic  con- 
siderations play  the  main  part  in  the  second.  Namely,  according  to  the 
latter  viewpoint  the  main  subject  will  be  to  examine  in  what  forms  one 
can  formulate  the  fundamental  concepts  of  relativity. 

From  now  on  I  want  to  deal  with  the  second  approach  of  axiomatic 
formulations,  namely,  how  to  formulate  physical  principles  of  relativity. 
In  this  approach,  we  are  not  anticipating  the  reproduction  of  special  and 
general  relativity  in  their  present  form  and  content.  Rather,  my  main 
concern  will  be  how  one  can  possibly  change  their  content. 

Then,  what  would  be  the  fundamental  concept  that  I  should  examine 
first?  One  may  start  from  considering  the  relation  between  matter  and 
space-time.  Or  one  may  consider  first  observers  and  invariance  of  physical 
laws.  The  latter  was  the  main  subject  of  the  work  on  equivalent  ob- 
servers, which  I  did  with  Takeno  [3] ,  and  also  of  my  work  [4]  on  equivalent 
observers  in  special  relativity.  I  shall  deal  mainly  with  the  subject  of 
observers  and  their  equivalence.  Most  of  the  content  of  this  paper  is  from 
the  papers  I  just  mentioned. 

2.  Equivalent  Observers.  In  general  relativity,  matter  and  space-time 
are  specified  by  each  other,  and  this  is  one  of  the  basic  characteristics  of 
the  theory.  In  special  relativity  matter  does  not  affect  directly  the 
structure  of  space-time.  There,  the  space-time  is  independent  of  the 
presence  of  matter  and  is  an  external  element  which  defines  modes  of 
existence  of  physical  phenomena.  In  special  relativity,  such  modes  of 
existence  of  physical  phenomena  are  determined  in  reference  to  the  state 
of  an  observer. 

It  is  for  this  reason  that  we  brought  up  the  concept  of  observers  as  the 
starting  point  of  our  work.  We  considered  first  the  existence  of  an 
observer  and  discussed  its  kinematical  aspect.  Following  the  work  by 
Takeno  and  Ueno  [3],  I  will  explain  how  this  was  actually  done.  The  first 
postulate  we  made  was  the  existence  of  a  three  dimensional  space  frame 
and  a  one  dimensional  time  frame  for  an  arbitrary  observer.  We  ex- 
pressed the  postulate  in  the  following  way: 

PI. Any equivalent observer M is furnished with a three-dimensional
'space-frame' S with origin M and a one-dimensional 'time-frame' T, and
can give one and only one set of space coordinates (x, y, z) and time coordinate
(t) to any point event E to within frame transformation.

Let  me  first  explain  what  is  meant  by  frame  transformation.  We 
regard  two  observers  relatively  at  rest  as  essentially  identical.  And  we 
call  frame  transformation  the  transformation  between  the  frames  of 
identical  observers,  that  is,  the  frames  relatively  at  rest  to  each  other  as 
well  as  such  transformations  of  the  time  axis  that  simply  change  the  scale 
of  the  time  frame,  namely,  regraduation. 

The  postulate  requires  that  an  observer  can  give  to  an  event  a  set  of 
four  real  numbers  representing  coordinates  (x,  y,  z,  t)  which  is  uniquely 
determined  to  within  frame  transformation. 

It  follows  that  there  exists  a  relation  between  the  coordinates  (x,  y,  z,  t) 
given to an event by an observer and the coordinates (x', y', z', t') given
to  the  same  event  by  another  equivalent  observer  in  his  own  frame. 
The  relation  is 


0,     (i, j = 1, 2, 3, 4),          (1)
$(x^1, x^2, x^3, x^4) = (x, y, z, t)$.

In  the  above  PI,  we  assumed  the  existence  of  a  three-dimensional  space- 
frame  and  a  one-dimensional  time-frame.  However,  it  is  not  necessarily 
required  that  the  two  frames  be  combined  to  form  a  four-dimensional  space- 
time.  In  this  sense,  this  postulate  may  not  be  relativistic.  Therefore, 
the  postulate  can  cover  both  relativistic  and  non-relativistic  theories. 
Namely,  the  postulate  is  not  characteristic  of  relativistic  theories.  In  fact, 
there  are  some  transformation  groups  for  which  we  can  find  no  four- 
dimensional  space-times  satisfying  the  postulate  of  equivalency. 

The  second  postulate  we  make  requires  that  any  observer  can  observe 
another  observer.  PI  permits  an  observer  to  assign  a  set  of  coordinates  to 
any  point  event,  but  it  does  not  necessarily  follow  from  this  that  the 
observer  can  do  the  same  to  another  observer.  The  second  postulate  is 
necessary  for  this  reason.  It  is  the  following. 

PII. Any observer M can observe all other equivalent observers and they
are all in motion relative to M.

Questions  may  arise  as  to  what  is  meant  by  being  in  motion.  Here  as 
in  the  ordinary  case,  we  say  an  observer  is  in  motion  relative  to  M  if  the 
spatial  coordinates  of  the  observer,  (x,  y,  z),  are  changing  with  time  t. 

The third postulate is a very important one.



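PIII. The group of frame transformations $\mathfrak{G}_0$ is given by the rotations

$$R_1 = z\,\partial_y - y\,\partial_z, \quad R_2 = x\,\partial_z - z\,\partial_x, \quad R_3 = y\,\partial_x - x\,\partial_y$$

and the translations

$$T_1 = \partial_x, \quad T_2 = \partial_y, \quad T_3 = \partial_z$$

of the space frame $S_M$ of M and the translation $U = \partial_t$ of time frame $T_M$ of
M. And this $\mathfrak{G}_0$ together with the set of transformations given by (1) forms a
continuous group of transformations $\mathfrak{G}$.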

The  first  question  concerning  this  postulate  will  be  why  this  particular 
transformation  was  chosen  as  the  frame  transformation.  In  our  work  we 
use  coordinates  without  attaching  any  special  meaning  to  them.  Mathe- 
matically that  should  be  satisfactory.  However,  we  must  examine  the 
physical  meaning  of  coordinates  in  order  to  compare  the  theory  with  the 
actual  world  in  some  way,  or  to  apply  the  theory  to  observations  of 
phenomena. 

If the above mentioned frame transformations $R_i$, $T_i$, U can be
interpreted as expressing the isotropy and homogeneity of space and the
stationary character of time, then quite naturally, we can regard (x, y, z)
as the cartesian coordinates of the euclidean space and t as the coordinate
of time flowing uniformly. Certainly three-dimensional Riemannian space
whose fundamental tensors are form invariant under coordinate
transformations $R_i$ and $T_i$ is euclidean. This can be easily confirmed. In our
work  we  have  not  assumed  the  metrical  structure  of  space  and  time.  Here 
we  shall,  however,  postulate  tentatively  that  the  physical  world  forms 
four  dimensional  space-time.  There  may  exist  several  ways  to  determine 
the  structure  of  this  space-time.  Here  we  shall  take,  as  an  example,  the 
following  one  tentatively. 

Let  us  first  notice  that  the  following  postulate  we  shall  take  here 
completely  determines  the  structure  of  space-time  in  which  equivalent 
observers  can  exist,  and  also  the  scale  and  the  clock  of  that  space-time. 
Namely, we postulate that the metric ds of the space-time be form
invariant under the group $\mathfrak{G}$ which is composed of the frame
transformation $\mathfrak{G}_0$ and the transformation among equivalent observers as given
by eq. (1). That is to say, we require that the space-time has the metric
$ds^2$ given by

$$ds^2 = g_{ij}\,dx^i dx^j \qquad (i, j = 1, 2, 3, 4)$$


with $g_{ij}$ which is form invariant under $\mathfrak{G}$. Then, the laws of nature, if they
can be expressed as tensor equations, will be form invariant under $\mathfrak{G}$.
Thus, the laws of nature will assume the same expression for equivalent
observers. This is the actual meaning of the equivalent observers.

We should also notice the following. Namely, as will be shown later by
an example, we found that for certain $\mathfrak{G}$'s, there exists no four-dimensional
space-time of the nature mentioned just now. In such cases, any two
observers connected by $\mathfrak{G}$ in any four-dimensional space-time whatsoever
will not be equivalent in the above sense. In such cases, we may take a
viewpoint different from that of usual relativistic theories and say that
there exists no four-dimensional space-time. How to interpret such an
extraordinary case must be determined in each case.

Now  let  us  return  to  the  main  story.  The  next  postulate  is: 

PIV. If M and M' are any two equivalent observers, they are in radial
motion with respect to each other, and, furthermore, if M observes any E on
the straight line MM', then M' also observes the same E on the straight line
M'M, independently of each time coordinate t and t'. Here, a straight line
means the set of all the points invariant under any rotation of S.

Implicit  in  this  postulate  is  an  assumption  that  we  can  treat  three- 
dimensional  space  in  analogy  with  one-dimensional  space.  Certainly  this 
assumption  will  be  natural.  However,  there  are  things  characteristic  of 
one-dimensional  space.  Therefore,  we  need  to  be  careful. 

Here  I  shall  only  mention  the  results  obtained  from  the  postulates  I 
discussed  so  far,  and  shall  not  explain  the  actual  calculations  we  did. 

We  found  that  the  transformations  between  equivalent  observers  thus 
obtained  were  classified  into  the  following  three  types.  They  are: 

(a) Lorentz-type transformation.

$x' = (x - vt)/\sqrt{1 - \alpha v^2}, \quad y' = y, \quad z' = z, \quad t' = (t - \alpha v x)/\sqrt{1 - \alpha v^2}.$

(b) Galilei transformation.

$x' = x - vt, \quad y' = y, \quad z' = z, \quad t' = t.$

(c) K-transformation (as named by Takeno).

$x' = x - v \exp(\alpha t), \quad y' = y, \quad z' = z, \quad t' = t.$


It is very interesting that we obtained the Lorentz-type transformation
without any assumption on the relative motion of observers. I will discuss
this point later. Here I shall discuss the K-transformation. A
characteristic feature of this transformation is that a point at rest in
system S' moves in the (ST) system with a velocity proportional to the
distance between the two origins of the S and S' systems. Namely, we
obtain from the above equation

$[dx/dt]_{x' = \mathrm{const.}} = \alpha\,[x]_{x' = 0}.$

This relation reminds us of the velocity-distance relation of nebular
motion. If we choose $\alpha$ as Hubble's constant, this expression can be
interpreted as Hubble's relation in the steady-state theory due to Bondi
and Gold [5]. Assuming that we regard the postulate PIII as expressing
the isotropy and homogeneity of space as well as the uniformity of time,
it may be interesting to consider the relation between the assumption of
invariance of the laws of nature under the K-transformation and the perfect
cosmological principle in the steady-state theory of cosmology. Thus we
may say that PIII satisfies in a sense the conditions required by the
perfect cosmological principle. In other words, we may say that PIII
expresses the essential content of the perfect cosmological principle.
Furthermore, there are many questions concerning the K-transformation,
such as: what invariant relations do we have under this transformation? or
what kind of dynamics corresponds to this transformation? We are now
studying the applicability of the transformation to cosmology.
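
The velocity-distance relation quoted for the K-transformation can be checked numerically. The following is only a minimal sketch: it assumes the transformation reads x' = x - v exp(alpha t), and the function names and numerical values are illustrative, not taken from the original papers.

```python
import math

def k_transform(x, t, v, alpha):
    """K-transformation of the x coordinate (assumed form): x' = x - v*exp(alpha*t)."""
    return x - v * math.exp(alpha * t)

def position_in_s(x_prime, t, v, alpha):
    """Inverse map: x coordinate in S of the point with fixed coordinate x' in S'."""
    return x_prime + v * math.exp(alpha * t)

def velocity_of_rest_point(x_prime, t, v, alpha, dt=1e-6):
    """Velocity in S of a point at rest in S', estimated by a central difference."""
    ahead = position_in_s(x_prime, t + dt, v, alpha)
    behind = position_in_s(x_prime, t - dt, v, alpha)
    return (ahead - behind) / (2 * dt)

v, alpha, t = 2.0, 0.5, 1.3                 # illustrative values only
origin_of_s_prime = position_in_s(0.0, t, v, alpha)   # where x' = 0 sits in S
speed = velocity_of_rest_point(7.0, t, v, alpha)

# Every point at rest in S' moves with velocity alpha * (distance between origins).
print(speed, alpha * origin_of_s_prime)
```

The velocity does not depend on which rest point of S' is chosen, only on the separation of the two origins, which is what suggests the comparison with Hubble's relation.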

Lastly we shall remark on some problems concerning the structure of
space and time. An especially remarkable feature of the K-transformation
is that there exists no four-dimensional space-time of which the metric
is form invariant under the group $\mathfrak{G}$ comprising the
K-transformation. Hence, it is not proper to imagine, in the above stated
sense, a four-dimensional space-time as the background in which we consider
equivalent observers connected by the K-transformation. We, therefore,
expect that a cosmology completely different from the relativistic one
will come out if we adopt this transformation.

3.  Equivalent  Observers  in  Special  Relativity.  Now  I  want  to  change  my 
subject  to  the  work  I  did  on  equivalent  observers  in  special  relativity. 
The  main  problem  is  how  to  axiomatize  fundamental  principles  of  special 
relativity.  Let  us  consider  first  the  special  principle  of  relativity.  How  to 
express  this  principle  differs  somewhat  from  person  to  person.  Here,  I 
borrow  from  the  statement  by  Einstein  himself  [6]. 

If K is an inertial system, then every other system K', which moves
uniformly and without rotation relative to K, is also an inertial system:
the laws of nature are in concordance for all inertial systems.


The principal concepts which should be examined in this principle are
the following: first, inertial system and uniform motion; then what is
actually meant by the statement that the laws of nature are in concordance
for all inertial systems. In my paper [4], I discussed mainly this
principle and did not touch the principle of constancy of light velocity.

Now  we  shall  try  to  axiomatize  the  special  principle  of  relativity.  At  the 
beginning  we  postulate  the  existence  of  observers,  space-frame  and  time- 
frame.  First,  we  make  the  same  postulate  as  PI  we  gave  before  in  Section 
2.  We  shall  call  it  AI  here. 

AI.  Any  equivalent  observer  M  is  furnished  with  a  three-dimensional 
'space-frame'  S  with  origin  M  and  a  one-dimensional  'time-frame'  T,  and 
can  give  one  and  only  one  set  of  space  coordinate  (x,  y,  z)  and  time  coordinate 
(t)  to  any  point  event  E  to  within  frame  transformation. 

By this postulate, it becomes possible to assign a set of space
coordinates (x, y, z) and a time coordinate (t) to any point event. The
postulate specifies the three-dimensionality of space and the
one-dimensionality of time. An important conclusion of relativity states
that space and time cannot be separated as two independent objective
entities. However, it does not follow from this conclusion that space and
time cannot be separated for each individual observer. Hence, our
postulate is not in contradiction with the existence of space-time in the
relativistic sense. From AI we can conclude that there exist different
observers and coordinate transformations between their space and time
frames.

Next we adopt PII stated in Section 2 and call it AII.

AII. Any observer M can observe all other equivalent observers and they
are moving relative to M.

Thirdly,  we  postulate  the  existence  of  uniform  motion.  This  is  the 
central  point  of  the  theory. 

AIII.     There  exist  point  events  which  move  uniformly. 

Instead of postulating the existence of uniform motions as done here,
we could have postulated the existence of clock and scale to define the
structure of space and time, and could have obtained the same result.
However, we want to use only kinematical concepts at the beginning.
Now questions arise as to what objects make uniform motion and also as
to how one can recognize uniform motion. The answer to these could be
given by introducing dynamical concepts. For instance, one could define
uniform motion from the absence of external forces. However, if we want
to proceed following this line of thought, dynamical aspects must be
postulated first. Here we shall not, however, do this. The actual problem
here will be how to express the uniform motion in the space-and-time-frame.
Next we shall consider this problem.

By AI each observer was given a space frame and a time frame.
However, there still remained the degree of freedom of the frame
transformations. Using this freedom, we shall choose the space and time
frames so that we can express uniform motion in a simple way.

DEFINITION 1. We call a coordinate system a NORMAL FRAME if the
coordinates (x, y, z, t) of a point event in uniform motion satisfy the
following relations in this frame.

(2) $x = v_x t + c_x, \quad y = v_y t + c_y, \quad z = v_z t + c_z.$

Here the v's and c's are constants. By these relations, we have now an
expression for uniform motion. Now we shall consider the frames which
are in uniform motion. In the following, we shall exclusively deal with
normal frames.

DEFINITION 2. If any point at rest in the frame (S'T') of an observer M'
always has coordinates that satisfy the relation (2) with the same v's in
the frame (ST) of another observer M, then the frame (S'T') is IN UNIFORM
MOTION RELATIVE TO (ST).

The  existence  of  such  a  normal  frame  can  be  a  question.  That  is  to 
say,  we  are  given  the  uniform  motion  by  postulate,  but  it  is  not  guaranteed 
that  we  can  always  find  a  frame  in  which  we  can  express  the  uniform 
motion  by  equation  (2).  Hence,  we  shall  assume  the  existence  of  a  normal 
frame. 

AIV.     To  each  observer,  there  exists  a  normal  frame. 

The next axiom is a key point of the special principle of relativity.

AV.  Any  normal  frame  which  can  be  obtained  by  frame  transformation 
from  a  normal  frame  (ST)  or  any  normal  frame  which  is  moving  uniformly 
relative  to  (ST)  is  equivalent  to  (ST) . 

The word "equivalent" used in AV above means that the laws of
nature are in concordance for the frames under consideration. We shall
postulate the following set of axioms for equivalency. These hold for the
usual equality relation. Writing $A \equiv B$ to express that A is
equivalent to B, we shall postulate the following relations:

AVI. Axiom of equivalence.

(i) $A \equiv A$,
(ii) if $A \equiv B$, then $B \equiv A$,
(iii) if $A \equiv B$ and $B \equiv C$, then $A \equiv C$.

From  the  above  AVI  we  can  easily  derive  the  following  theorem. 

THEOREM  I.  Coordinate  transformations  between  equivalent  frames  form 
a  group. 

From the above axioms we can obtain the explicit form of the coordinate
transformation from one normal frame to another. It is

(3) $x'^i = a^i_j x^j + c^i, \quad \det(a^i_j) \neq 0, \quad (i, j = 1, 2, 3, 4).$

Here the a's and c's are constants. As is well known, these
transformations form the affine group. Therefore we obtain the following
theorem.

THEOREM  2.  The  set  of  transformations  between  normal  frames  forms 
the  affine  group. 

Evidently, this group includes as a sub-group the group of frame
transformations.
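
The group property and the preservation of uniform motion can be illustrated with a small sketch. For brevity it works in one space dimension plus time, so an event is a pair (x, t) and a transformation of type (3) is a pair (A, c) with a 2x2 matrix A; the function names and the sample numbers are illustrative assumptions, not part of the original argument.

```python
def apply(g, u):
    """Apply the affine map g = (A, c) to the event u = (x, t): u -> A u + c."""
    A, c = g
    return [A[i][0] * u[0] + A[i][1] * u[1] + c[i] for i in range(2)]

def compose(g2, g1):
    """Return the affine map 'g1 first, then g2'; it is again of the form (A, c)."""
    (A2, c2), (A1, c1) = g2, g1
    A = [[sum(A2[i][k] * A1[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    c = [sum(A2[i][k] * c1[k] for k in range(2)) + c2[i] for i in range(2)]
    return A, c

def inverse(g):
    """Inverse affine map; it exists exactly when det(A) is non-zero."""
    A, c = g
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("det(A) must be non-zero")
    Ainv = [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]
    cinv = [-(Ainv[i][0] * c[0] + Ainv[i][1] * c[1]) for i in range(2)]
    return Ainv, cinv

def collinear(p, q, r, tol=1e-9):
    """True if three events lie on one straight world line."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) < tol

galilei = ([[1.0, -0.5], [0.0, 1.0]], [2.0, 0.0])   # x' = x - 0.5 t + 2, t' = t
general = ([[2.0, 1.0], [0.5, 1.5]], [-1.0, 3.0])   # det = 2.5, so invertible

# A uniform motion x = 0.3 t + 1 sampled at three times.
world_line = [[0.3 * t + 1.0, t] for t in (0.0, 1.0, 2.5)]
image = [apply(compose(general, galilei), u) for u in world_line]
round_trip = apply(compose(inverse(general), general), [4.0, -1.0])
```

Here `image` is again a straight world line, and `round_trip` returns the event it started from, in line with Theorem 2.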

If  we  further  want  to  derive  the  constancy  of  the  velocity  of  light,  we 
have  to  define  clock,  scale  or  the  metrical  structure  of  space  and  time. 
By  suitable  stipulation  of  these  concepts,  we  shall  obtain  the  Lorentz 
transformations. 

Before proceeding further, I want to come back to the problem of how
to define uniform motion. The linear form we adopted was of course in
direct analogy with euclidean space. Of course, there is no a priori reason
for euclidean space. However, that the euclidean space is plausible may
be seen as follows. In order to discuss the structure of space and time, we
will have to introduce the metric of the space. Let us assume that the
metric dl of (x, y, z) space is given by

$dl^2 = g_{ij}\,dx^i\,dx^j, \quad (i, j = 1, 2, 3), \quad (x^1, x^2, x^3) = (x, y, z).$

It will be quite natural to assume that the distance dl, which a point in
uniform motion travels in time dt, is proportional to dt. If we assume this,
then $g_{ij}$ must be constant. From this we can easily prove the euclidean
property of the space. Then, we can introduce a cartesian coordinate
system, and can define scale. Clock can be defined by combining scale
and uniform motion.

In  pre-relativistic  theories,  it  is  postulated  that  the  running  rate  of  a 
clock  is  the  same  for  all  observers,  independently  of  their  state  of  motion. 
Namely,  the  existence  of  an  absolute  time  lapsing  objectively  is  assumed. 
We  do  not  make  such  an  assumption,  since  there  is  no  compelling  reason 
for  this.  The  running  rate  of  the  moving  clock  can  be  determined  by  (3), 
namely  by  its  state  of  motion  and  the  nature  of  scale  which  is  determined 
by  the  euclidean  nature  of  the  space. 

The axiomatic formulation of the special principle of relativity has
been the main problem of the foregoing discussions. Our papers were
attempts aimed at this end. Of course, we did not aim at rigorous
axiomatization of the theory. Our interest was not in logical exactness but
was rather in knowing how to express the content of the special principle of
relativity. We believe that any attempt to axiomatize special relativity
should start from analyzing the content of the special principle of
relativity in all possible ways.

Our  work  reveals  that  uniform  motion,  normal  frame  and  Minkowski 
space-time  are  cyclically  related  and  that  logically  there  is  no  reason  to 
give  priority  to  one  of  them.  Therefore,  either  to  assume  the  existence 
of  objects  which  undergo  uniform  motion  first,  or  to  assume  Minkowski 
space-time  first,  will  be  a  kind  of  tautology. 

If we want simplicity and rigor in the axiomatization of special
relativity, then the existence of Minkowski space-time will have to be
postulated first. Or to postulate the constancy of light velocity first
instead of doing it last may be a simpler way than to specify the nature of
space-time first. Whichever way we choose, there remain a number of
problems to be considered in the axiomatization of special relativity. Our
work will serve to solve one of these problems; however, it has the
following weak point. Namely, the weakest point of our paper lies in not
drawing any conclusion about how to specify the space-time structure. On
the other hand, because of this deficiency we are left with the freedom of
choosing a space-time structure. This is the next problem to be studied.


Bibliography 

[1] GÖDEL, K., An example of a new type of cosmological solutions of
    Einstein's field equations of gravitation. Reviews of Modern Physics,
    vol. 21 (1949), pp. 447-450.
    ----, A remark about the relationship between relativity and idealistic
    philosophy, in SCHILPP, P. A. (ed.), Albert Einstein: Philosopher-Scientist.
    New York 1951, pp. 555-562.
    GRÜNBAUM, A., Das Zeitproblem. Archiv für Philosophie, vol. 7 (1957),
    pp. 165-208.
[2] NARIAI, H., On a new cosmological solution of Einstein's field equations
    of gravitation. The Science Reports of the Tôhoku University, Ser. I,
    vol. XXXV (1951), pp. 62-67.
[3] UENO, Y. and H. TAKENO, On equivalent observers. Progress of Theoretical
    Physics, vol. 8 (1952), pp. 291-301.
[4] ----, On the equivalency for observers in the special theory of
    relativity. Progress of Theoretical Physics, vol. 9 (1953), pp. 74-84.
[5] BONDI, H., Cosmology. Cambridge 1952, 146 pp.
[6] EINSTEIN, A., The meaning of relativity. Princeton 1953, 25 pp.


Symposium  on  the  Axiomatic  Method 


ON  THE  FOUNDATIONS  OF  QUANTUM  MECHANICS  1 

HERMAN  RUBIN 

University  of  Oregon,  Eugene,  Oregon,  U.S.A. 

1. We shall consider several formulations of the foundations of quantum
mechanics, and some of the mathematical problems arising from
them. Various of these problems will be treated in greater or less detail.

Most of the results presented here are not new, and it is the purpose of
this paper mainly to bring to the attention of workers in this field
some of the difficulties which they have blithely overlooked. Most of the
mathematicians dealing with the foundations of quantum mechanics have
concerned themselves mainly with Hilbert space problems; one point they
have brought out is the distinction between pure and mixed states. We
shall not concern ourselves here with this problem, but shall confine our
attention to pure states.

We give three formulations in detail: A, the Hilbert space formulation
with unitary transition operators; B, the matrix-transition-probability-
amplitude formulation; and C, the phase-space formulation. Each of these
formulations is adequate for quantum mechanics. In formulation A in the
classical case, the problem is usually specified by specification of the
Hamiltonian and then solved by means of the Schrödinger equation.
Feynman has proposed a method of path integrals; these are not, as
claimed, averages over a stochastic process. A similarity to stochastic
processes exists and should be exploited, but it does not mean that
theorems and methods applicable in stochastic processes automatically
apply. The same remarks apply to approach B, and a table is included of
some important differences between stochastic and quantum processes.
The identifiability problem is also pointed out for formulation B.

Formulation C is formally much closer to stochastic processes than A
or B, but important differences are apparent. First and most important,
the joint "density" of position and momentum need not be non-negative
or even integrable. This, it seems to the author, implies that not only are
position and momentum not simultaneously precisely measurable, but
that they are not even simultaneously measurable at all. It is true that
non-negativeness of the density is preserved, but even here the motion is
not that of a stochastic process.

1 Research partially supported by an OOR contract. Reproduction in whole or in
part is permitted for any purpose of the United States government.

2. Let $\mathscr{H}$ be a Hilbert space, $\mathscr{S}$ a partially ordered
set, which in the relativistic case could be thought of as the set of all
space-like surfaces, and in the classical case all points of time. Suitable
conditions which will not be discussed here are to be imposed on
$\mathscr{S}$.

A. For all $S, T \in \mathscr{S}$, $S < T$, there is a unitary operator
$U_{TS}$ on $\mathscr{H}$ such that if $R < S < T$,

(1) $U_{TR} = U_{TS} U_{SR}.$

In  the  classical  case 

(2) 

where H is the Hamiltonian, and the Hilbert space may be taken to be $L_2$
over a Euclidean space of suitable dimensionality.

A  central  problem  in  quantum  mechanics  is  specification  of  the  Hilbert 
space  and  unitary  operators  involved. 

Let E and F be complete spectral decompositions of the identity.
Since for all $x \in \mathscr{H}$, $x = \int dE\,x = \int dF\,x$, we have
$U_{TS}x = \iint dF\,U_{TS}\,dE\,x$, integrated first over E. But this is
just the formulation of matrix mechanics. Thus if suitable regularity
conditions are satisfied,

B. For all $R, S, T \in \mathscr{S}$, $R < S < T$, D, E, F complete
spectral decompositions of the identity,

(3) $dF_T = \int d\lambda_{TS}(F, E)\,dE_S$

and

(4) $d\lambda_{TR}(F, D) = \int d\lambda_{TS}(F, E)\,d\lambda_{SR}(E, D).$

One can reconstruct U from $\lambda$.

If the spectral decompositions are discrete, the integration becomes a
summation. Also, we have the following interpretation of $\lambda$: the
probability that an observation at "time" T will yield a result in a set
$\mathscr{X}$ given that an observation at "time" S yields a result E is

(5) 


This has been interpreted as analogous to a stochastic process. However,
the differences are quite apparent to one familiar with stochastic
processes, and are important. For a stochastic process, the analogues of (3)
and (4) are customarily taken as definitions. However, expression (5) is
replaced by

(6) $\int_{\mathscr{X}} \lambda_{TS}(F, E)\,dF.$

The analogue of approach A is not as immediate. $\mathscr{H}$ is to be
replaced by an $L_1$ space over a finite measure space, which can be
abstractly characterized. Then $U_{TS}$ becomes a positive linear operator
on $\mathscr{H}$ to $\mathscr{H}$ and (1) is satisfied. In addition, for
some strictly positive function $f_1$, and all S and T, $U_{TS} f_1 = f_1$.
Also we may frequently, but not always, in the stationary classical case,
write

(7) $U_{TS} = \exp[(T - S)V],$

where  V  is  called  the  infinitesimal  generator  of  the  semigroup  U. 

To see the differences clearly, let us consider the classical case where the
Hilbert space is $l_2$, i.e., all sequences of real numbers with finite sums
of squares. Complex Hilbert space seems natural in quantum mechanics,
but since every Hilbert space is automatically a real Hilbert space, and
the analogy is better, we could use the real case. However, the complex
case actually provides a closer analogy to a real stochastic process! If we
now take E = F to be the natural decomposition of $l_2$, we may make the
following analogy with discrete-space stochastic process. Starred sections
refer only to stationary processes with linear "time".

Stochastic process                        Quantum mechanics

Markov matrix $U_{TS}$                    Unitary matrix $U_{TS}$

Transition probability $U_{TSij}$         Transition probability $|U_{TSij}|^2$

*Infinitesimal generator does not         *Infinitesimal generator always
 always exist and is not always            exists and is unique.
 unique.

*In the regular case, the infinitesimal   *Infinitesimal generator is a skew
 generator has all row sums 0, and all     Hermitian matrix.
 nondiagonal elements non-negative.

Ordering of $\mathscr{S}$ irreversible.   Ordering of $\mathscr{S}$ reversible.

*Trivial if periodic.                     *Can be non-trivial and periodic.
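
Several rows of this comparison can be seen in the smallest possible example. The sketch below is an illustration under stated assumptions: it uses a real 2x2 rotation as the unitary matrix, generated by the skew matrix V = [[0, -1], [1, 0]], and the helper names are invented for the example.

```python
import math

def rotation(t):
    """Unitary (here real orthogonal) matrix exp(t V) for the skew generator V."""
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]

def transition_probabilities(U):
    """Quantum transition probabilities |U_ij|^2."""
    return [[abs(u) ** 2 for u in row] for row in U]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t = math.pi / 4
P_t = transition_probabilities(rotation(t))        # every entry is 1/2
P_2t = transition_probabilities(rotation(2 * t))   # essentially [[0, 1], [1, 0]]
P_t_squared = matmul(P_t, P_t)                     # what a Markov chain would give

# The amplitudes compose as in (1), but the probabilities do not.
print(P_2t[0][0], P_t_squared[0][0])

# The motion is periodic yet non-trivial: after time 2*pi the matrix is the identity.
full_period = rotation(2 * math.pi)
```

Each row of `P_t` sums to 1, so it looks like a Markov matrix at a single time, but `P_2t` differs from `P_t_squared`, so the family of probability matrices is not a Markov semigroup.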

From A, if the Hilbert space is explicitly an $L_2$ space, it may be
possible to write for a dense set of functions

(8) $U_{TS}x(u) = \int K_{TS}(u, v)\,x(v)\,dv,$

where $K_{TS}$ is a unitary kernel. It may be possible, and indeed in the
classical case it is, to determine the T-derivative of K at T = S. Suppose
$K_{TS}^*$ is a unitary approximation to $K_{TS}$, such that the
T-derivatives of K and K* coincide at T = S. In the classical case, Feynman
did this by writing

(9) $K_{TS}^*(u, v) = N(T - S)\exp\left[\frac{i}{\hbar} A_{TS}(u, v)\right],$

where $A_{TS}(u, v)$ is the action along the classical path from v at
"time" S to u at "time" T. Then we may define $U_{TS}^*$ from $K_{TS}^*$ in
a manner analogous to (8). It may be that

$U_{TS} = \lim \prod_{i=1}^{n} U^*_{T_i T_{i-1}}, \quad T_0 = S, \quad T_n = T, \quad T_{i-1} < T_i,$

when the partition becomes fine. Although there are several treatments
in the literature, including some by prominent mathematicians, the
existence and value of this limit has not been proved. From the
Schrödinger equation, one can prove the following

THEOREM: If there exists a basis of $L_2$ such that for each function x in
the basis, the second derivative of $U_{TS}x$ has a uniformly integrable
Fourier transform, then $U_{TS} = \lim \prod_{i=1}^{n} U^*_{T_i T_{i-1}}$,
where $T_0 = S$, $T_n = T$, $T_{i-1} < T_i$, and the partition becomes fine.

It  seems  likely  that  this  result  can  be  considerably  extended. 

If we examine the analytic form of (9), we find that it resembles that of
a diffusion process. However, the "variance" of the "diffusion process"
would have to be purely imaginary. Furthermore, there are even periodic
models in quantum mechanics which satisfy the theorem above. If T - S
is a multiple of the period, $K_{TS}$ cannot be a function in the ordinary
sense. In fact, if T - S is a multiple of any discrete spectral value, this
difficulty arises.

Another difficulty with this formulation is the statement that in the
limit $K_{TS}$ is the normalized mean value on x of exp(A(u, v, x)) where x
is a path with end points v at S and u at T. In the case of a diffusion
process, it is well known that the corresponding exponent is infinite with
probability one. The same difficulty has already been noted in the
quantum-mechanical formulation.

The Feynman expression is also rather difficult to evaluate. However,
stochastic process methods may be useful. While the process has purely
imaginary variance, we may compute the diffusion process with real variance
and use analytic continuation. Again, it remains to be proved that this
method is correct. An intermediate approach would be to apply analytic
continuation to the coefficient of the kinetic energy term alone. This last
method has worked for the free particle and the harmonic oscillator, and
methods for computing the results in general have been given by Kac.
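
For the free particle the continuation can be checked directly. The sketch below is a minimal illustration, assuming the standard one-dimensional diffusion kernel with coefficient D and the standard free-particle kernel of the Schrödinger equation; the substitution D = i hbar / (2m) and all names and numbers are assumptions of the example.

```python
import cmath
import math

def heat_kernel(x, t, D):
    """Diffusion kernel (4 pi D t)^(-1/2) exp(-x^2 / (4 D t)); D may be complex."""
    return cmath.exp(-x * x / (4 * D * t)) / cmath.sqrt(4 * math.pi * D * t)

def free_propagator(x, t, m=1.0, hbar=1.0):
    """Free-particle kernel sqrt(m / (2 pi i hbar t)) exp(i m x^2 / (2 hbar t))."""
    prefactor = cmath.sqrt(m / (2j * math.pi * hbar * t))
    return prefactor * cmath.exp(1j * m * x * x / (2 * hbar * t))

m, hbar = 1.0, 1.0
x, t = 0.7, 0.3                       # illustrative displacement and elapsed time

real_variance = heat_kernel(x, t, 0.5)               # genuine diffusion, D real
continued = heat_kernel(x, t, 1j * hbar / (2 * m))   # same formula, D imaginary
direct = free_propagator(x, t, m, hbar)

print(abs(continued - direct))        # agreement up to rounding
```

With real D the kernel is a positive probability density; with imaginary D it has modulus independent of x, which is one way to see that no probability measure on paths stands behind it.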

One merit of the Feynman approach is that it has great possibility of
generalization in that it leads to a specific result for $U_{TS}$, the
specification of which is a main problem of quantum mechanics and usually
overlooked by mathematicians dealing with the subject.

There is an outstanding question which arises from the empirical
standpoint; namely, if the model is correct, how much of the model can
be determined by even an infinite number of observations? This seems to
be most clearly brought out in formulation B above. For simplicity, let
us assume that the decompositions E and F are discrete. Then the
observable quantities are $|\lambda_{TSij}|^2$. Clearly these are not always
adequate for fixed E and F even if S and T are arbitrary.

In the discrete case, $\lambda_{TSij} = (U_{TS} f_i, \varphi_j)$. If we may
vary E arbitrarily, we may determine $U_{TS} f_i$ completely apart from a
constant of absolute value 1 for each i. If furthermore E = F and for
almost all S, T, $U_{TSij} \neq 0$ for all i and j, we can determine
$U_{TSij}$ apart from a constant of absolute value 1 independent of i and
j, i.e., apart from a gauge transformation.
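
The inadequacy of the observable quantities for fixed decompositions can be made concrete. The sketch below, with invented helper names, multiplies the rows and columns of a 2x2 unitary matrix by arbitrary phases; the result is a different unitary matrix with exactly the same squared moduli.

```python
import cmath
import math

def regauge(U, row_angles, col_angles):
    """Return D_row U D_col, where D_row and D_col are diagonal phase matrices."""
    return [[cmath.exp(1j * row_angles[i]) * U[i][j] * cmath.exp(1j * col_angles[j])
             for j in range(len(U[0]))] for i in range(len(U))]

def observables(U):
    """The quantities accessible to observation: |U_ij|^2."""
    return [[abs(u) ** 2 for u in row] for row in U]

s = 1 / math.sqrt(2)
U = [[s, s], [s, -s]]                               # a simple unitary matrix
V = regauge(U, [0.3, 1.1], [0.0, 2.0])              # arbitrary illustrative phases

same_statistics = all(
    abs(observables(U)[i][j] - observables(V)[i][j]) < 1e-12
    for i in range(2) for j in range(2)
)
largest_difference = max(abs(U[i][j] - V[i][j]) for i in range(2) for j in range(2))
print(same_statistics, largest_difference)
```

Only by varying the decompositions, as described above, can the relative phases be pinned down; a single overall phase remains undetermined in any case.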

Another approach is the statistical approach of Moyal. This approach,
originally due to Wigner, is to investigate the joint "distribution" of
position and momentum. First, suppose a finite number $A_1, \ldots, A_n$ of
Hermitian operators are given. Then if they have a joint distribution, its
characteristic function is $E(\exp \sum_j i t_j A_j)$. However, the operator
inside the expectation is a unitary operator, and consequently the
expectation in question exists.

Therefore we should be able to determine the distribution from the
expectation. For example, let $A_1$, $A_2$ and $A_3$ be the spin operators
for an electron in a hydrogen atom about which nothing has been deduced by
experimentation about the spin. Then, in units with $\hbar = 1$,
$E(\exp \sum_j i t_j A_j) = \cos(|t|/2)$, where $|t|$ is the length of
$(t_1, t_2, t_3)$, which is certainly not the characteristic function of any
distribution. Let us proceed as if this difficulty does not arise, and let
us treat the case of position and momentum. We obtain the characteristic
function


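(10) $E(\exp(i\alpha p + i\beta q)) = \int \psi^*(q - \tfrac{1}{2}\hbar\alpha)\,e^{i\beta q}\,\psi(q + \tfrac{1}{2}\hbar\alpha)\,dq$

and the corresponding density

(11) $f(p, q) = \frac{1}{2\pi}\int \psi^*(q - \tfrac{1}{2}\hbar\alpha)\,e^{-i\alpha p}\,\psi(q + \tfrac{1}{2}\hbar\alpha)\,d\alpha.$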


Another example of the misbehavior of f is in order. Let us consider a
plane wave passing through a slit of aperture 2a. Then
$\psi(x) = 1/\sqrt{2a}$, $-a < x < a$, and we obtain

(12) $f(p, q) = \begin{cases} \dfrac{1}{2\pi a p}\,\sin\dfrac{2(a - |q|)p}{\hbar}, & |q| < a, \\ 0, & |q| > a. \end{cases}$

We clearly see that f is not non-negative, and not even Lebesgue
integrable.
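
Both kinds of misbehavior can be checked with a little arithmetic. The sketch below rests on stated assumptions: spin operators equal to half the Pauli matrices in units with hbar = 1 and a completely unpolarised state, so that the expectation is cos(|t|/2); and the slit density read as sin(2(a - |q|)p/hbar)/(2 pi a p) for |q| < a. The first check uses the fact that a genuine characteristic function phi makes the sum of phi(t_i - t_j) over any finite set of points non-negative (Bochner's condition); the function names are illustrative.

```python
import math

def spin_cf(t):
    """Assumed spin 'characteristic function' cos(|t|/2) for t = (t1, t2, t3)."""
    return math.cos(math.sqrt(sum(c * c for c in t)) / 2)

def bochner_sum(points, phi):
    """Sum over all ordered pairs of phi(t_i - t_j); never negative for a true
    characteristic function."""
    return sum(phi([a - b for a, b in zip(ti, tj)])
               for ti in points for tj in points)

side = 2 * math.pi
square = [(0.0, 0.0, 0.0), (side, 0.0, 0.0), (0.0, side, 0.0), (side, side, 0.0)]
violation = bochner_sum(square, spin_cf)   # 4 - 8 + 4*cos(sqrt(2)*pi), about -5.07

def slit_density(p, q, a=1.0, hbar=1.0):
    """Assumed phase-space 'density' for a uniform wave function on (-a, a)."""
    if abs(q) >= a:
        return 0.0
    width = 2 * (a - abs(q)) / hbar
    if p == 0:
        return width / (2 * math.pi * a)   # limiting value as p -> 0
    return math.sin(width * p) / (2 * math.pi * a * p)

negative_value = slit_density(2.0, 0.0)    # sin(4) < 0, so the "density" is negative
print(violation, negative_value)
```

The negative Bochner sum shows that no joint distribution of the three spin components can have cos(|t|/2) as its characteristic function, and the negative value of the slit density shows directly that f is not a probability density.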

It would be desirable to have an abstract characterization of all
permissible "densities", as the density is adequate both for the kinematics
and for the dynamics of quantum mechanics. Let us proceed to do so. As
to the kinematics, it follows from (11) that for almost all x, y,

(13) $\psi(x)\,\psi^*(y) = \int f\left(p, \frac{x + y}{2}\right) e^{ip(x - y)/\hbar}\,dp.$

Therefore

(14) $\iint f\left(p, \frac{x + y}{2}\right) f\left(\pi, \frac{z + w}{2}\right) e^{i[p(x - y) + \pi(z - w)]/\hbar}\,dp\,d\pi$

$= \iint f\left(p, \frac{x + w}{2}\right) f\left(\pi, \frac{z + y}{2}\right) e^{i[p(x - w) + \pi(z - y)]/\hbar}\,dp\,d\pi$

for almost all x, y, z, w. If, in addition, $\int f(p, x)\,dp$ is a
probability density, there will be a unique solution for $\psi$ apart from a
factor of absolute value 1. Conversely, if f satisfies (14) and
$\int f(p, x)\,dp$ is a probability density, the $\psi$ deduced from f by
(13) yields f in return.

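Concerning the dynamics of the process, Moyal has shown that the
temporal derivative of the characteristic function (10) is, where H is the
classical Hamiltonian,

(15) $\frac{\partial}{\partial t} E(\exp(i\alpha p + i\beta q)) = \frac{i}{\hbar} \iint \left[ H(p + \tfrac{1}{2}\hbar\beta,\, q - \tfrac{1}{2}\hbar\alpha) - H(p - \tfrac{1}{2}\hbar\beta,\, q + \tfrac{1}{2}\hbar\alpha) \right] e^{i(\alpha p + \beta q)} f(p, q)\,dp\,dq.$

Inverting this, we obtain for the derivative of the density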

(16)  -^M 


J  J  W*-**'*6  (-  f  (?-?')-  -|  (P-P'))f(P',  q', 


where $\mathscr{I}(u + iv) = v$, and $\hat{H}$ denotes the Fourier transform
of H. A more convenient form of (16) is


(17) 


dt 

i     r  c        . 

—  Jfta,  q  + 


J  J 


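Even this form gives some difficulties in evaluation because of the
non-existence in the usual sense of $\hat{H}$, and the right-hand side of
(16) has to be evaluated by approximations. The form which Moyal seems to
prefer is even worse in this respect, but it also has some advantages.

(18) $\frac{\partial f}{\partial t} = \frac{2}{\hbar} \sin\left[ \frac{\hbar}{2} \left( \frac{\partial}{\partial q_H} \frac{\partial}{\partial p_f} - \frac{\partial}{\partial p_H} \frac{\partial}{\partial q_f} \right) \right] H(p, q)\, f(p, q, t),$

where the subscripts indicate whether a derivative acts on H or on f.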

[This  latter  form  shows  more  clearly  the  relationship  between  classical 
and  quantum  mechanics,  but  the  differential  operator  on  the  right  is  of 
infinite  order  and  analytic  difficulties  may  clearly  ensue.  In  the  case  in 
which  H  is  a  polynomial  of  degree  at  most  2,  (18)  reduces  to  the  classical 
equations  of  motion;  quantum-mechanical  considerations  come  in  only 
through restrictions (14) on f.]
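
The reduction to the classical equations for a Hamiltonian of degree at most 2 can be written out in one line. The following is a sketch under the assumption that (18) is read in Moyal's sine form; the abbreviation $\Lambda$ for the bracketed differential operator is introduced here only for the illustration.

```latex
% Assumed reading of (18): \partial f/\partial t = (2/\hbar)\,\sin(\hbar\Lambda/2)\,H f, where
% \Lambda = \partial_{q}^{H}\partial_{p}^{f} - \partial_{p}^{H}\partial_{q}^{f}
% (superscripts mark the function on which each derivative acts).
\[
\frac{2}{\hbar}\sin\!\Big(\frac{\hbar}{2}\Lambda\Big) H f
  = \Lambda H f \;-\; \frac{\hbar^{2}}{24}\,\Lambda^{3} H f \;+\; \cdots
\]
% Every term of \Lambda^{3} H f contains a third derivative of H, and the later
% terms contain still higher ones, so for a polynomial H of degree at most 2
% only the first term survives:
\[
\frac{\partial f}{\partial t}
  = \frac{\partial H}{\partial q}\frac{\partial f}{\partial p}
  - \frac{\partial H}{\partial p}\frac{\partial f}{\partial q},
\]
% which is Liouville's equation of classical mechanics.
```

Planck's constant has dropped out of the equation of motion entirely, so in this case it can enter only through the admissibility condition on the initial density.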

In any case, it follows that the dynamics of the phase-space
representation above does not further involve the wave function.
Consequently, the dynamics of $\psi$ is determined up to a gauge
transformation by equation (17), and hence the following formulation is
adequate for classical one-dimensional quantum mechanics:

C. There is a function f of three arguments satisfying almost everywhere,
for some value t of its third argument, (14), such that $\int f(p, x, t)\,dp$
is a probability density, and satisfying (17).

It  is  clear  how  to  extend  this  to  higher  dimensional  cases. 

This "probabilistic" procedure might also be used to construct the
unitary kernel $K_{TS}$ for the Feynman approach, although this has not
been done.


Bibliography 

[1] FEYNMAN, R. P., Space-time approach to nonrelativistic quantum mechanics.
    Reviews of Modern Physics, vol. 20 (1948), p. 367.
[2] GELFAND, I. M. and A. M. YAGLOM, Integration in function spaces and its
    application to quantum physics. Uspekhi Matematicheskikh Nauk (N.S.),
    vol. 11 (1956), p. 77.
[3] KAC, M., On some connections between probability theory and differential
    and integral equations. Proceedings of the Second Berkeley Symposium on
    Mathematical Statistics and Probability, University of California,
    Berkeley 1951.
[4] MONTROLL, E. W., Markoff chains, Wiener integrals, and quantum theory.
    Communications on Pure and Applied Mathematics, vol. 5 (1952), p. 415.
[5] MORETTE, C., On the definition and approximation of Feynman's path
    integrals. Physical Review, vol. 81 (1951), p. 848.
[6] MOYAL, J. E., Quantum mechanics as a statistical theory. Proceedings of
    the Cambridge Philosophical Society, vol. 45 (1949), p. 99.
[7] SEGAL, I. E., Postulates for general quantum mechanics. Annals of
    Mathematics (2), vol. 48 (1947), p. 930.
[8] STONE, M. H., Notes on integration I, II, III, IV. Proceedings of the
    National Academy of Sciences, U.S.A., vol. 34 (1948), p. 336, p. 447,
    p. 483, vol. 35 (1949), p. 50.


Symposium  on  the  Axiomatic  Method 


THE  MATHEMATICAL  MEANING  OF  OPERATIONALISM 
IN  QUANTUM  MECHANICS 

I.  E.  SEGAL 

University  of  Chicago,  Chicago,  Illinois,  U.S.A. 

1. Introduction. An operational treatment may be described as one that
deals  exclusively  with  observables;  but  the  latter  term  is  physically  as 
well  as  mathematically  somewhat  ambiguous.  Our  aim  here  is  to  circum- 
scribe this  ambiguity  by  axioms  for  the  observables  that  will  be  satis- 
factory as  far  as  they  go,  but  by  no  means  categorical.  On  the  other  hand, 
it  will  turn  out  that  it  is  not  too  far  from  such  axioms  to  plans  for  a 
categorical  model  representing  the  field  of  all  elementary  particles. 

The  need  to  consider  so  broad  a  system  arises  in  several  ways.  For  one 
thing,  no  axiom  system  is  secure  if  it  does  not  treat  a  closed  system,  and 
except  substantially  in  the  case  of  classical  quantum  mechanics  (by  which 
we  mean  the  non-relativistic  quantum  mechanics  of  a  finite  number  of 
degrees  of  freedom),  there  is  no  mathematical  or  physical  assurance  that 
the  systems  conventionally  considered  are  really  closed.  In  fact  the 
evidence,  —  highly  inconclusive  as  it  may  be,  —  points  very  much  in  the 
other  direction.  For  another,  although  the  mathematical  foundations  of 
classical  quantum  mechanics  are  in  a  relatively  satisfactory  state  from  at 
least  a  technical  point  of  view  (the  theory  is  consistent,  within  obvious 
limits  categorical,  and  realistic),  time  and  energy  play  crucial  but  puzzling 
roles,  as  observables  unlike  the  others.  While  this  remains  true  in  rela- 
tivistic  quantum  field  theory,  for  different  reasons,  it  seems  fair  to  say 
that  one  of  the  accepted  informal  axioms  of  the  theory  is  that  it  must 
ultimately  contain  the  solution  to  the  puzzle,  if  such  exists. 

We  should  not  gloss  over  the  question  of  just  what  is  a  quantum  field 
theory,  —  in  fact,  this  is  the  main  question  we  wish  to  examine  here.  It  is 
a  difficult  question,  since  at  present  what  we  have,  after  thirty  years  of 
intensive effort, is a collection of partially heuristic technical developments
in search of a theory; but it is a natural one to examine axiomatically.
Present practice is largely implicitly axiomatic, and nothing
resembling a mathematically viable explicit constructive approach has
yet been developed. In any event a constructive approach must presumably
describe the physical particles with which an operational theory
must  deal  in  terms  of  the  only  remotely  operational  bare  particles,  a 
problem  that  is  relatively  involved  in  the  current  non-rigorous  treatments, 
and  needs  to  be  clarified  by  a  suitable  axiomatic  formulation. 

Description  of  a  field,  whether  classical  or  quantum,  involves  analyti- 
cally three  elements:  (a)  its  phenomenology,  i.e.  the  statement  of  what 
mathematically  are  the  observables  of  the  field,  and  what  are  their 
physical  interpretations,  —  including  especially,  in  the  case  of  quantum 
fields,  the  statistics,  i.e.  the  observables  called  single-particle  occupation 
numbers,  which  do  not  exist  in  classical  fields,  and  form  the  basis  for  the 
particle  interpretation  of  quantum  fields;  (b)  its  kinematics,  i.e.  the 
transformation  properties  of  the  field  observables  under  the  fundamental 
symmetry  group  of  the  system;  (c)  its  dynamics,  or  the  'temporal'  de- 
velopment of  the  field,  where  however  the  'dynamical  time'  involved 
must  be  distinguished  from  the  'kinematical  time'  involved  in  (b).  The 
dynamics  results  from  the  interaction  between  the  particles  constituting 
the  field,  and  is  in  fact  its  only  observable  manifestation,  while  the  ki- 
nematics has  nothing  to  do  with  this  interaction. 

The  present  state  of  the  axiomatics  of  these  elements  and  of  the 
desiderata  relevant  for  further  developments  is  discussed  from  a  jointly 
mathematical  and  operational  viewpoint  in  the  following. 

2.  Phenomenology.  This  is  the  best-developed  of  the  relevant  phases 
of  quantum  mechanics  from  both  a  mathematical  and  an  operational 
point  of  view.  One  knows  that  the  bounded  observables,  which  are  the 
only  ones  that  can  in  principle  be  measured  directly,  form  a  variety  of 
algebra,  of  which  the  self-adjoint  elements  of  a  uniformly  closed  self- 
adjoint  algebra  of  operators  on  a  Hilbert  space  (C*-algebra)  is  virtually 
the  exclusive  practical  prototype.  One  knows  also  that  the  states  of  the 
system are represented by normalized positive linear functionals on the
algebra,  the  value  of  such  a  functional  on  an  element  being  what  is 
conventionally  called  the  'expectation  value  of  the  observable  in  the 
state'  in  physics,  but  there  being  no  operational  distinction  between  the 
state  and  the  associated  functional,  —  i.e.  operationally  (and  in  our  usage 
in the following) a state is precisely such a functional. In these terms the
essential notions of pure state, spectral value of an observable, probability
distribution  of  an  observable  in  a  state,  etc.,  can  be  axiomatized  and  shown 
to  admit  a  mathematical  development  adequate  for  physical  needs. 
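
As a finite-dimensional illustration of the notion of state used above (an illustration only, not drawn from the text): on the algebra of 2×2 complex matrices, a normalized positive linear functional can be written as A ↦ tr(ρA) for a density matrix ρ. The sketch below assumes this special case; the names `state_from_density`, `rho` and `omega` are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def state_from_density(rho):
    """Return the functional A -> tr(rho A) (assumed form of a state)."""
    def omega(a):
        return trace(matmul(rho, a))
    return omega


identity = [[1, 0], [0, 1]]
sigma_x = [[0, 1], [1, 0]]          # a bounded observable
rho = [[0.75, 0], [0, 0.25]]        # hypothetical density matrix, trace 1
omega = state_from_density(rho)

# Normalization: omega(identity) is 1.  Positivity: omega(B*B) >= 0 for any B.
```

Here `omega(sigma_x)` plays the role of the 'expectation value of the observable in the state'; operationally only such numbers enter, which is why the state may be identified with the functional itself.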

An  important  conclusion  of  the  theory  is  that  a  physical  system  is 
completely  specified  operationally  by  giving  the  abstract  algebra  formed 
by  the  bounded  observables  of  the  system,  i.e.  the  rules  for  forming  linear 
combinations  of  and  squaring  observables.  In  particular,  operationally 
isomorphic  algebras  of  observables  that  are  represented  by  concrete  C*- 
algebras  on  Hilbert  spaces,  do  not  at  all  need  to  be  unitarily  equivalent, 
even  when,  for  example,  they  are  both  irreducible.  The  irrelevant  and 
impractical  requirement  of  unitary  equivalence  is  in  fact  the  origin  of 
serious  difficulties  in  the  development  of  quantum  field  theory,  a  point 
with  which  we  shall  deal  more  explicitly  later. 

The  subsumption  of  quantum  fields  under  general  phenomenology 
involves  the  formulation  and  treatment  of  the  'canonical  field  variables' 
and  the  'occupation  numbers'.  Traditionally  the  former  were  an  ordered 
set of symbols p_1, p_2, ... and q_1, q_2, ... satisfying the commutation
relations  that  had  been  so  successful  in  classical  quantum  mechanics. 
(This  is  for  'Bose- Einstein'  fields;  relevant  also  are  'Fermi-Dirac'  fields, 
but  as  these  involve  no  great  essential  novelty  as  far  as  the  present 
aspects  of  axiomatics  go,  the  present  article  treats  only  the  Bose-Einstein 
case.)  It  was  assumed  that  these  were  an  irreducible  set  of  self-adjoint 
operators,  and  that  any  two  such  systems  were  equivalent;  upon  this 
informal  axiomatic  basis  the  theory  rested.  But  from  the  very  beginning 
the success of quantum field theory was attended by 'infinities' in even
the  simplest  cases,  and  more  recently  it  has  been  found  that  there  exist 
at  least  continuum  many  inequivalent  irreducible  systems  of  canonical 
variables.  Such  troubles  made  it  uncertain  whether  the  phenomenological 
structure  described  above  was  strictly  applicable  in  the  case  of  quantum 
fields,  or  at  least  whether  the  canonical  variables  really  were  self-adjoint 
operators  in  a  Hilbert  space.  The  proper  sophistication,  based  on  a 
mixture  of  operational  and  mathematical  considerations,  gives  however  a 
unique  and  transparent  formulation  within  the  framework  of  the  phe- 
nomenology described;  the  canonical  variables  are  fundamentally 
elements  in  an  abstract  algebra  of  observables,  and  it  is  only  relative  to  a 
particular  state  of  this  algebra  that  they  become  operators  in  Hilbert 
space. 



In  a  formal  way  it  was  easily  seen  that  the  symbolic  operator 
(p_k + iq_k)(p_k − iq_k) had integral proper values (i^2 = −1), and for this
and related reasons could be interpreted as 'the number of particles in the
field in the kth state', which is essentially what puts the 'quantum' into
'quantum  field  theory',  by  giving  it  a  particle  interpretation.  Those 
particles,  the  'quanta'  of  the  field,  have  generally  been  presumed  to  be 
'represented'  by  the  vectors  in  a  linear  space,  proportional  vectors  being 
identified.  This  linear  space  L  does  not  have  direct  operational  sig- 
nificance, since  what  is  more-or-less  directly  observed  are  the  'occupation 
numbers  of  single-particle  states',  i.e.  the  observables  just  defined 
(formally).  But  the  general  principle  that  there  exists  (theoretically)  a 
single-particle space L, spanned by an infinite set of vectors e_1, e_2, ...,
and such that p_k + iq_k can represent in a certain sense the creation of a
particle with 'wave function' e_k, and the operator defined above the
total  number  of  such  particles  in  the  field,  has  attained  virtually  as  well- 
established  a  position  as  the  general  phenomenological  principles  de- 
scribed earlier.  The  great  empirical  success  of  relativistic  quantum  electro- 
dynamics, in  which  the  photon  and  the  electron  are  represented  by 
suitably  normalizable  solutions  of  Maxwell's  and  Dirac's  equation, 
respectively, provides, among other developments, a basis for this principle,
and indicates also that L should admit a distinguished positive-
definite  Hermitian  form,  which  determines,  e.g.,  when  two  particles  are 
empirically  similar.  It  is  conservative  as  well  as  useful  in  treating  certain 
theories  of  recent  origin  to  assume  only  a  distinguished  topological 
structure  that  may  be  induced  by  such  a  form,  which  turns  out  to  involve 
no  really  significant  weakening  of  the  foundations,  and  ultimately  to 
clarify  their  logical  structure.  In  fact,  partly  for  logico-mathematical 
reasons,  and  partly  with  a  view  to  deriving  ultimately  the  relevance  of 
complex  scalars  for  the  single-particle  space  from  invariance  under  so- 
called  particle-anti-particle  conjugation,  it  is  appropriate  to  assume 
initially  that  the  single-particle  structure  is  given  by  an  ordered  pair  of 
mutually  dual,  real-linear  spaces  with  the  topological  structure  described, 
and with which the canonical p's and q's are respectively associated. A
distinguished  admissible  positive-definite  inner  product  in  one  of  these 
spaces  will  give  a  distinguished  complex  Hilbert  space  structure  on  the 
direct  sum  of  the  two  spaces,  but  there  are  other  ways  in  which  this  more 
conventional  structure  may  arise. 
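
The integrality of the proper values of (p_k + iq_k)(p_k − iq_k) can be checked on a single canonical pair. The following is a minimal sketch, assuming the convention [q, p] = i (with ħ = 1) and a truncation of the usual oscillator matrices to finitely many levels; the function names are hypothetical, and under this normalization the values come out as 0, 2, 4, ... (a different scaling of p and q would give 0, 1, 2, ...).

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def truncated_pair(dim):
    """Truncated matrices for one canonical pair (p, q), assuming [q, p] = i."""
    lower = [[0j] * dim for _ in range(dim)]      # lowering (annihilation) matrix
    for n in range(1, dim):
        lower[n - 1][n] = complex(math.sqrt(n))
    upper = [[lower[j][i].conjugate() for j in range(dim)] for i in range(dim)]
    root2 = math.sqrt(2)
    q = [[(lower[i][j] + upper[i][j]) / root2 for j in range(dim)]
         for i in range(dim)]
    p = [[-1j * (lower[i][j] - upper[i][j]) / root2 for j in range(dim)]
         for i in range(dim)]
    return p, q


def occupation_operator(dim):
    """Matrix of (p + iq)(p - iq) in the truncated basis."""
    p, q = truncated_pair(dim)
    plus = [[p[i][j] + 1j * q[i][j] for j in range(dim)] for i in range(dim)]
    minus = [[p[i][j] - 1j * q[i][j] for j in range(dim)] for i in range(dim)]
    return matmul(plus, minus)
```

In this basis p + iq is a multiple of the raising matrix, in keeping with its interpretation as creating a particle, and the product is diagonal, so its proper values can be read off directly.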

Taking  then  a  conservative  position,  and  defining  a  phenomenological 
single-particle structure as an ordered pair of real-linear spaces (H, H')
that are mutually dual in the sense that there is given a distinguished
non-singular bilinear form x·y' (x ∈ H, y' ∈ H'), a quantum field relative
to  this  structure  may  be  rigorously,  but  provisionally,  described  as  an 
ordered  pair  of  maps  (p(.),  q(.))  from  H  and  H'  respectively  to  the  self- 
adjoint  operators  on  a  complex  Hilbert  space  K,  satisfying  the  'Weyl 
relations' : 

    e^{ip(x)} e^{ip(y)} = e^{ip(x+y)},     e^{iq(x')} e^{iq(y')} = e^{iq(x'+y')},
    e^{ip(x)} e^{iq(y')} = e^{i x·y'} e^{iq(y')} e^{ip(x)},

which  are  formally  equivalent  to  the  conventional  commutation  re- 
lations, but  mathematically  more  viable,  in  that  difficulties  associated 
with unbounded operators such as the p's and q's themselves, are avoided.
This  is  merely  an  honest,  if  slightly  sophisticated  and  general,  mathe- 
matical transcription  from  the  ideas  and  practice  of  physical  field  theory, 
but  it  is  useful  in  providing  a  basis  for  deciding  what  is  literally  true 
about  quantum  fields,  and  what  is  figurative  or  symbolic.  Thus  the 
physical  folk-theorem :  'Any  two  irreducible  quantum  fields  are  connected 
by  a  unitary  transformation'  is  literally  false,  although  it  has  figurative 
validity,  which  on  the  basis  of  a  further  mathematical  development  can 
be made rigorously explicit. The needs of field dynamics lead to this
development  and  to  a  revision  of  the  present  provisional  notion  of  quan- 
tum field  which  will  be  indicated  later. 
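
The formal equivalence between the Weyl relations and the conventional commutation relations can be made explicit by a short computation. The sketch below is purely formal (it ignores domain questions for the unbounded operators) and assumes the commutator [p(x), q(y')] = −i x·y', a sign convention inferred here from the mixed Weyl relation rather than stated in the text.

```latex
% Formal computation only; assumes [p(x), q(y')] = -i\, x \cdot y'
% and that this commutator is a scalar (hence central).
\begin{align*}
A &= i\,p(x), \qquad B = i\,q(y'), \qquad [A,B] = -[p(x),q(y')] = i\, x\cdot y', \\
e^{A} e^{B} &= e^{A+B+\frac{1}{2}[A,B]}, \qquad
e^{B} e^{A} = e^{A+B-\frac{1}{2}[A,B]}, \\
e^{ip(x)} e^{iq(y')} &= e^{[A,B]}\, e^{iq(y')} e^{ip(x)}
                      = e^{i\, x\cdot y'}\, e^{iq(y')} e^{ip(x)} .
\end{align*}
```

The second line is the Baker-Campbell-Hausdorff identity for operators whose commutator is central; the bounded unitary operators on the left-hand side are what make the Weyl form mathematically safer than the commutator itself.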

Also  in  need  of  revision  is  the  definition  of  occupation  number  of  a 
single-particle  state.  The  validity  of  the  occupation  number  interpretation 
of  the  given  operator  depends  in  part  on  the  representation  of  the  total 
field  energy  (etc.)  in  terms  of  occupation  numbers  of  states  of  given 
energy,  in  keeping  with  the  idea  that  it  should  equal  the  sum  of  the  pro- 
ducts of  the  various  possible  single-particle  energies  with  the  numbers  of 
particles  in  the  field  having  these  energies.  This  holds  for  a  certain 
mathematically  and  physically  distinguished  quantum  field  in  the  fore- 
going sense,  studied  by  Fock  and  Cook,  often  called  the  'free  field', 
although  actually  of  dubious  application  to  free  incoming  physical  fields, 
and  almost  certainly  inapplicable  to  interacting  fields.  In  any  event,  it 
breaks  down  in  the  case  of  arbitrary  fields,  and  there  has  been  some  un- 
certainty as  to  whether  a  physically  meaningful  particle  interpretation  of 
an  arbitrary  field  could  be  given.  The  solution  to  this  problem  depends 
on  the  proper  integration  of  statistics  with  kinematics,  to  which  we  now 
turn. 



3.  Kinematics.  It  is  axiomatic  that  a  suitable  displacement  of  the 
single-particle  structure  should  effect  a  corresponding  field  displacement. 
In  the  case  of  a  classical  field,  given  say  by  Maxwell's  equations,  it  is 
clear that an arbitrary Lorentz transformation L induces a transformation
U(L)  in  the  space  of  solutions.  From  a  quantum-field-theoretic  point  of 
view  however,  U(L)  is  merely  a  displacement  in  the  single-particle  space 
(of  normalizable  photon  states),  and  what  is  needed  is  a  transformation 
V(L)  on  the  field  vector  state  space  K  of  the  preceding  section.  The 
assumption  that  V(L)  exists  means  essentially  that  any  admissible  change 
of  frame  in  ordinary  physical  space  should  give  a  corresponding  transfor- 
mation on  the  field  states.  In  addition,  the  assumed  independence  of 
transition  probability  rates  of  elementary  particle  processes  from  the 
local  frame  of  reference  has  led  to  the  further  assumption  that  V(L)  is  a 
projective  unitary  representation  of  the  Lorentz  group,  in  at  least  the  case 
of  the  'free  incoming'  physical  field. 

In  addition  to  the  Lorentz  group,  there  is  a  group  of  transformations  in 
the  single-particle  vector  state  space  which  plays  an  important  part  in 
nuclear physics, and which does not arise from transformations in ordinary
physical space, namely, transformations in isotopic spin space. In the
absence  of  precise  knowledge,  it  is  assumed  that  this  group  acts  indepen- 
dently of  the  Lorentz  group,  but  its  precise  structure  as  an  abstract 
group  is  undecided,  and  it  is  quite  uncertain  whether  it  is  rigorously  true 
that  these  transformations  commute  with  the  action  of  the  Lorentz  group 
on the single-particle space. There is also the group of gauge transformations,
which is important in quantum electrodynamics, but does not
have  any  counterpart  in  most  other  elementary  particle  interactions. 
The  improper  Lorentz  transformations  have  recently  been  the  subject 
of  intense  interest.  These  transformations  give  rise  to  outer  automorphisms 
of  the  proper  Lorentz  group,  and  there  seems  to  be  at  present  no  oper- 
ational reason  to  doubt  that  this  is  their  chief  significance  (rather  than  as 
direct  transformations  in  ordinary  space-time),  but  the  experimental 
situation  is  far  from  giving  any  assurance  that  this  is  the  case.  In  the  case 
of  standard  relativistic  theory,  this  leaves  only  charge  and  particle-anti- 
particle  conjugation,  of  which  the  latter  is  connected  with  the  equivalence 
between  particle  and  the  contragredient  anti-particle  transformations, 
and  does  not  appear  to  represent  in  a  natural  way  a  group  element. 
Finally,  these  and  other  kinematical  loose  ends,  together  with  the 
dynamical divergences, have led certain scientists to investigate the
possibility that some other group may give more satisfactory results than
the  Lorentz  group,  just  as  this  group  gave  ultimately  a  sounder  theory 
than  the  Galilean  group  of  Newtonian  mechanics,  and  of  which  the  Lo- 
rentz group  will  be  a  type  of  degenerate  form,  just  as  the  Galilean  group  is 
a  degenerate  form  of  the  Lorentz  group. 

On  a  conservative  basis,  it  seems  that  about  all  that  may  legitimately 
be  assumed  of  a  mathematically  definite  character  is  that  there  exists  a 
fundamental  symmetry  group  G,  which  may  reasonably  be  assumed  to 
be  topological,  and  which  acts  linearly  and  continuously  on  the  single- 
particle  vector  state  space.  A  priori  it  might  appear  that  this  is  not  suf- 
ficient as  a  basis  for  an  effective  field  kinematics,  but  it  turns  out  that 
special  properties  of  G  and  of  its  action  on  the  single-particle  space  are 
not  significant  as  regards  the  foundations  of  field  kinematics.  The  main 
desideratum  is  to  establish  the  appropriate  action  of  G  on  the  field,  and 
this  exists  substantially  in  all  cases,  provided  it  is  the  operational  action 
that  is  considered.  That  is  to  say,  the  action  of  G  on  the  state  vectors  of 
the  field,  —  which  in  the  case  of  standard  relativistic  theory  is  given 
formally  in  detail  in  the  recent  treatments  of  field  theory  in  the  literature, 
—  does  not  need  to  exist  in  a  mathematical  sense,  any  more  than  it  exists 
operationally;  but  the  action  of  G  on  the  field  observables,  which  is 
formally  to  transform  them  by  its  action  on  the  state  vectors,  has  effective 
mathematical  existence.  However,  to  this  end  it  is  necessary  to  make  the 
revision  of  the  notion  of  quantum  field  referred  to  above,  to  which  one  is 
naturally  led  by  dynamical  and  further  operational  consideration. 

Before  going  into  these  matters,  we  mention  that  the  generality  of  the 
foregoing  approach  to  kinematics  permits  the  integration  of  the  statistics 
with  the  kinematics.  Any  non-singular  continuous  linear  transformation 
on the single-particle structure (H, H') preserving the fundamental skew
form x·y' − u·v' (x and u arbitrary in H, y' and v' arbitrary in H') acts
appropriately on the field observables; in particular certain phase
transformations  in  the  single-particle  space  so  act,  and  the  occupation 
numbers  are  obtained  as  generators  of  one-parameter  groups  of  such  field 
actions.  A  development  of  this  type  is  needed  for  the  particle  interpre- 
tation of  fields,  if  one  is  to  avoid  the  ad  hoc  assumption  that  the  free 
incoming physical field is mathematically representable by the special
representation  referred  to  earlier,  as  well  as  for  dealing  with  the  concept 
of  bound  state. 



4.  Dynamics.  In  conventional  theoretical  physics,  a  dynamical  transfor- 
mation is  represented  by  a  unitary  transformation  mathematically.  In 
the  case  of  an  abstract  algebra  of  observables  as  described  above,  it  has 
however  no  meaning  to  say  that  a  transformation  of  this  algebra  is  given 
by  a  unitary  transformation,  for  this  may  be  true  in  certain  concrete 
representations  of  the  algebra  and  not  in  others.  It  is  clear  though  that 
the  transformation  of  the  observables  determined  by  a  unitary  operator  in 
a  concrete  representation  is  an  automorphism  of  the  algebra.  Since 
operationally  an  automorphism  has  all  the  relevant  features  of  a  dynamical 
(or,  for  that  matter,  kinematical)  transformation,  one  is  led  to  a  gener- 
alization of  conventional  dynamics  in  which  such  a  transformation  is 
axiomatized  as  an  automorphism  of  the  algebra  of  observables.  This  is  a 
proper generalization, in the sense that it is not always possible to represent
an automorphism of an abstract C*-algebra by a similarity
transformation  by  a  unitary  operator  in  a  given  concrete  representation 
space ;  but  what  is  more  relevant  to  field  theory  is  that  even  when  each  of 
a  set  of  automorphisms  can  be  so  represented,  there  will  generally  be  no 
one representation in which all of the automorphisms are so representable.
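
For reference, the statement that a unitary operator in a concrete representation determines an automorphism can be verified in one line each. The symbol α_U below is a label introduced here for illustration; it does not appear in the text.

```latex
% U unitary on the representation space; A, B observables; c a scalar.
\begin{align*}
\alpha_U(A) &= U A U^{*}, \\
\alpha_U(cA + B) &= c\,\alpha_U(A) + \alpha_U(B), \\
\alpha_U(AB) &= U A U^{*} U B U^{*} = \alpha_U(A)\,\alpha_U(B), \\
\alpha_U(A^{*}) &= U A^{*} U^{*} = \alpha_U(A)^{*} .
\end{align*}
```

The converse is what fails in general: an automorphism of the abstract algebra need not be of the form α_U in a given representation, which is the sense in which the generalization is proper.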

This difficulty does not arise to any significant extent in the quantum
mechanics  of  a  finite  number  of  degrees  of  freedom,  for  due  to  a  special 
property  of  finite  systems  of  canonical  variables,  every  automorphism  of 
the  conventionally  associated  algebra  of  observables  can,  in  any  concrete 
representation,  be  induced  by  a  unitary  operator.  But  in  the  case  of  a 
quantum  field,  there  are  simple  apparent  dynamical  transformations  that 
can  be  shown  to  be  not  implementable  by  any  unitary  transformation  in 
the  case  of  the  Fock-Cook  field.  Now  there  is  no  physical  reason  why  every 
self-adjoint  operator  on  the  field  vector  state  space  should  even  in 
principle  be  measurable,  but  it  has  not  been  clear  how  to  distinguish,  in 
effective  theoretical  terms,  those  which  were.  To  arrive  at  such  a  dis- 
tinction, we  consider  that  the  canonical  variables  themselves  should  be 
measurable,  and  also,  in  accordance  with  conventional  usage  in  the  case  of 
a  finite  number  of  degrees  of  freedom,  any  bounded  'function'  of  any  finite 
set  of  canonical  variables.  However,  since  only  finitely  many  particles  are 
involved  in  real  observations,  other  self-adjoint  operators  are  only  doubt- 
fully measurable,  except  that  uniform  limits  of  such  bounded  functions 
must  also  be  measurable,  since  their  expectation  value  in  any  state  is 
simply  the  limit  of  the  expectation  values  of  the  approximating  bounded 
functions. That is to say, uniform approximation is operationally meaningful,
since operators are close in this sense if the maximum spectral value
of  their  difference  is  small.  The  point  is  now  that  the  simple  apparent 
dynamical  transformations  that  could  not  be  represented  by  unitary 
transformations  in  the  field  state  space  can  however  be  represented  by 
automorphisms  of  the  algebra  of  observables  just  arrived  at  (e.g.  division 
of the canonical p's by λ > 1 and multiplication of the canonical q's by λ
can  be  represented  by  such  an  automorphism,  although  not  by  a  unitary 
transformation  in  the  Fock-Cook  field). 
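
That the scaling just mentioned is at least formally admissible can be seen from the Weyl relations. The sketch below assumes the formal linearity p(tx) = t p(x) and q(ty') = t q(y') for real t, which is suggested by the additive Weyl relations; the subscripted symbols p_λ and q_λ are labels introduced here for illustration.

```latex
% Assumed: p(tx) = t\,p(x), \; q(ty') = t\,q(y') for real t.
\begin{align*}
p_\lambda(x) &= \lambda^{-1} p(x) = p(\lambda^{-1}x), \qquad
q_\lambda(y') = \lambda\, q(y') = q(\lambda y'), \\
e^{ip_\lambda(x)}\, e^{iq_\lambda(y')}
  &= e^{ip(\lambda^{-1}x)}\, e^{iq(\lambda y')}
   = e^{i(\lambda^{-1}x)\cdot(\lambda y')}\, e^{iq(\lambda y')}\, e^{ip(\lambda^{-1}x)} \\
  &= e^{i\,x\cdot y'}\, e^{iq_\lambda(y')}\, e^{ip_\lambda(x)} .
\end{align*}
```

The rescaled pair therefore satisfies the same Weyl relations, so the map is a candidate automorphism of the algebra of observables even where no unitary operator implements it.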

More  generally,  the  algebra  of  measurable  field  operators  defined  above 
is  the  same  for  all  concrete  quantum  fields  as  defined  earlier.  That  is, 
for any two quantum fields (p(.), q(.)) and (p'(.), q'(.)), relative to the
same single-particle structure, there exists an isomorphism between the
corresponding algebras that takes any (say, bounded Baire) function
of p(x) into the same function of p'(x) for all x, and similarly for the q's.
This  isomorphism  is  in  fact  unique,  from  which  it  can  be  deduced  that  any 
continuous  linear  single-particle  transformation  leaving  invariant  the 
fundamental  skew  form  gives  rise  to  a  corresponding  automorphism  of  the 
algebra.  This  resolves  the  problem  of  defining  the  field  kinematics  when 
the  single-particle  kinematics  is  given. 

For  an  operational  field  dynamics  we  have  to  deal  mainly  (if  not,  indeed, 
exclusively)  with  the  particular  transformation  that  connects  the  so-called 
incoming  and  outgoing  free  fields,  which  may  be  defined  as  the  scattering 
automorphism.  In  view  of  the  uniqueness  of  the  algebra  of  field  observables, 
it  does  not  matter  in  which  representation  this  automorphism  is  given. 
Tied  up  with  these  notions  are  those  of  the  physical  vacuum  state, 
physical  particle  canonical  variables  and  occupation  numbers,  and  the 
scattering  operator.  Since  what  is  more-or-less  directly  observed  for 
quantum  field  phenomena  is  interpretable  as  the  scattering  of  an  incoming 
field  of  particles,  it  is  appropriate  to  attempt  to  formulate  these  various 
notions in terms of a given scattering automorphism s. The physical vacuum
state  must  certainly  satisfy  the  condition  of  invariance  under  s.  This  will 
in  general  not  give  a  unique  state,  but  it  is  fairly  reasonable  to  assume  that 
in  a  realistic  theory,  the  additional  requirement  of  invariance  under  the 
kinematical  action  of  a  maximal  abelian  subgroup  of  the  fundamental 
symmetry  group  may  well  give  uniqueness.  The  axiom  of  covariance 
asserts  that  s  commutes  with  the  kinematical  action  of  the  entire  symmetry 
group  on  the  field  observables,  and  from  this  and  a  well-known  fixed-point 
theorem  the  existence  of  a  physical  vacuum  as  so  defined  follows. 



Given  a  state  of  an  abstract  C*-algebra  that  is  invariant  under  an 
abelian  group  of  automorphisms,  there  corresponds  in  a  well-known 
mathematical  manner,  a  concrete  representation  of  the  algebra  on  a 
complex  Hilbert  space  K,  and  a  unitary  representation  of  the  abelian 
group  on  the  space,  which  give  similarity  transformations  effecting  the 
automorphisms.  In  this  way  there  is  determined  the  unitary  scattering 
operator S, which in this particular representation implements the
automorphism  s,  and  a  unitary  representation  of  the  maximal  abelian 
subgroup  of  the  covariance  group.  The  vacuum  state  is  represented  by  a 
vector of K, left invariant by S and this unitary representation. (In the
application  to  standard  relativistic  theory,  the  abelian  subgroup  would 
consist  of  translations  in  space-time,  which  in  conventional  theory  leaves 
only  the  physical  vacuum  fixed,  among  all  physical  states.)  The  incoming 
field  is  defined  as  that  given  by  the  representation,  and  the  outgoing  field 
as its transform under S, both having the vector state space K; to avoid
subtle and technical mathematical questions in this connection the
physically plausible assumption of continuity of the physical vacuum
expectation values of the A e^{ip(x)} B and A e^{iq(y')} B, at least when x and y'
range over finite-dimensional subspaces of the single-particle space, is
made, where A and B are fixed but arbitrary field observables. The
p(x) and q(y') that generate the homomorphic images of the one-parameter
groups [e^{itp(x)}: −∞ < t < ∞] and [e^{itq(y')}: −∞ < t < ∞] are
defined as the canonical variables of the free incoming physical field, and
their transforms under S, those of the outfield. In defining single-particle
state occupation numbers, it is convenient to assume present a distinguished
complex Hilbert space structure in the direct sum H + H'. For
any single-particle state vector x in H + H', there is then a unique
continuous one-parameter unitary group [U(t): −∞ < t < ∞] taking x
into e^{it}x and leaving fixed the orthogonal complement of x. The
corresponding automorphisms of the algebra of field observables likewise form
a  one-parameter  group.  In  general  they  will  not  leave  invariant  the  phy- 
sical vacuum  state,  but  again  making  physically  plausible  continuity  and 
boundedness  assumptions,  there  will  be  obtained  finally  a  corresponding 
one-parameter group of linear transformations in K, which will have a
'diagonalizable' generator, i.e. one similar (in general, via a non-unitary
transformation) to a self-adjoint operator. Although these occupation
numbers are not self-adjoint, they have the crucial properties of having
integral proper values; of being such that the total in-field energy,
momentum, etc. are the sum of the products of all single-particle energies,
momenta, etc. with the occupation numbers of the corresponding states in
a formal, but partially rigorizable, manner; and of annihilating the
physical vacuum state vector.

The  fundamental  problem  of  quantum  field  dynamics  from  an  overall 
point  of  view  is  and  always  has  been  that  of  the  so-called  divergences.  In 
present  terms,  this  is  the  problem  of  establishing  the  existence  of  the 
scattering  automorphism  s,  which  must  satisfy  certain  conditions,  which 
however  can  not  be  stated  with  mathematical  precision,  this  lack  of 
precision  being  an  inherent  difficulty  of  the  problem.  That  the  present 
approach may well be relevant to this problem may be seen in the following
way. The scattering automorphism may be given as an infinite product
integral, and the crucial difficulty has always been that of establishing the
existence of the integrand. This is given formally by a complex exponential
of the integral at a particular time of the 'interaction Hamiltonian', whose
character is relevant here only to the extent that in a variety of interesting
and typical cases, it is a linear expression in the canonical p's and q's,
whose coefficients are relatively untroublesome operators. E.g. in certain
current  theories  of  meson-nucleon  interaction,  they  are  simply  finite- 
dimensional  matrices;  for  fully  quantized  electrodynamics  'in  a  box'  they 
are  mutually  commutative  self-adjoint  operators  in  a  Hilbert  space.  Now 
there  is  no  doubt  that  these  formal  operators  are  divergent,  in  the  sense 
that  they  do  not  represent  bona  fide  self-adjoint  operators  in  the  Fock- 
Cook  representation,  —  in  fact  their  domains  in  general  appear  to  consist 
only  of  {0}.  But  in  dealing  with  these  formal  operators,  we  are  at  liberty 
to  change  the  representation  employed  for  the  canonical  p's  and  q's 
according to the foregoing development. Now it can be shown that there
always exists a representation for which $\sum_k q_k \times T_k$ represents in an
obvious manner a bona fide hermitian operator, provided that each
$T_k$ is a bounded operator. One can deal similarly with
$\sum_k (q_k \times T_k + p_k \times V_k)$ when the $T_k$ and $V_k$ are mutually
commutative self-adjoint operators. In either case the complex exponential
will be a well-defined
unitary  operator.  Thus  although  the  final  physical  results  are  independent 
of  the  representations  employed  in  setting  up  the  theory,  the  divergence 
or  convergence,  as  operators  in  Hilbert  space,  of  expressions  involved  in 
the analysis, may depend strongly on the representation.¹

1  For  a  more  detailed  account  of  certain  physical  points,  as  well  as  references  to 
proofs  of  relevant  mathematical  results,  see  Segal  [2]  and  [3].  For  another  approach 
to  the  axiomatics  of  quantum  field  theory  from  a  partially  heuristic  point  of  view, 
but  with  points  of  contact  with  the  present  approach,  see  R.  Haag  [1]. 



Bibliography 

[1]    HAAG, R., On quantum field theories. Matematisk-Fysiske Meddelelser udgivet
       af det Kgl. Danske Videnskabernes Selskab. 29 (1955).
[2]    SEGAL, I. E., The mathematical formulation of the measurable symbols of quantum
       field theory and its implications for the structure of free elementary particles. To
       appear in the report of the International Conference on the Mathematical
       Problems of Quantum Field Theory (Lille, 1957).
[3]    SEGAL, I. E., Foundations of the theory of dynamical systems of infinitely many
       degrees of freedom. I. Matematisk-Fysiske Meddelelser udgivet af det Kgl. Danske
       Videnskabernes Selskab. 31 (1959).


Symposium  on  the  Axiomatic  Method 


QUANTUM  THEORY  FROM  NON-QUANTAL  POSTULATES 

ALFRED LANDÉ

Ohio State University, Columbus, Ohio, U.S.A.

1.  Physical  and  Ideological  Background.  Theoretical  physics  aims  at 
deducing  formal  relations  between  observed  data  by  the  combination  of 
simple  and  general  empirical  propositions  which,  if  true,  will  'explain'  the 
variety  of  phenomena.  In  the  process  of  constructing  a  physical  theory  on 
a  postulational  basis  one  may  distinguish  between  three  steps.  First,  by 
critical  evaluation  of  experience  one  arrives  at  ideological  pictures  for  the 
connection  of  individual  data  (e.g.  for  the  'path'  of  a  firefly,  Margenau) 
and  at  general  notions  expressed  in  everyday  language  which  takes  much 
for  granted  and  may  involve  circularity  in  the  definition  of  terms.  Second, 
the  resulting  picture  is  formalized  and  condensed  into  general  laws. 
Third,  the  formal  laws  are  now  put  in  correspondence  with  a  physical 
'model'  which  gives  an  operational  definition  of  each  symbol,  resulting  in 
a  self-consistent  physical  theory.  In  spite  of  its  vagueness,  step  1  is  of 
importance  to  the  physicist  since  it  furnishes  a  legitimate  basis  for  his 
selection of one formalism among many possible ones as the formal
substructure of his laws.

The quantum theory in its historical development has followed this
procedure; its laws are based today on a few universal, though rather
baffling, principles, the most prominent among them being those of
wave-particle duality, qp-uncertainty, and complementarity. I submit,
however, that the process of reduction has not gone far enough, and
that the quantum principles just mentioned can be reduced further to
simple empirical propositions of a non-quantal character, the combination
of which yields the quantum principles as consequences. The latter can
thus be 'explained' on an elementary and more or less familiar
background "so that our curiosity will rest" (Percy Bridgman). Conforming
with step 1 above, I begin with considerations of a somewhat vague
character in order to lay the ideological groundwork for the formal
substructure of quantum mechanics.

Two objects, A and B, or two 'states' A and B of the same 'kind' of
object, may be said to be different, written $A \neq B$, when A and B are
discernible, i.e. separable by means of some device, shortly denoted as a
'filter', responding to B with 'no' when $B \neq A$, and with 'yes' when
$B = A$, as depicted by Figs. 1a and 1b, where $\bar{A}$ is written for
different from A or non-A. The terms 'state', 'filter', 'kind' of system
(atom) are introduced without operational definition; they happen to
correspond to actual situations in microphysical experiments, however.


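[Figs. 1a, 1b, 1c: an A-filter passing all, blocking all, and passing a
fraction of the incident B-state particles, respectively.]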

As an illustration, A may signify a state of vertical orientation of the
molecular axis of a certain kind of particle, and the A-filter may be a
screen with a vertical slit. State $\bar{A}$ may be a state of horizontal
orientation of the same particle, so that the A-filter blocks $\bar{A}$-state
particles.

Imagine now that, starting from a state $B \neq A$ (Fig. 1b), one gradually
'changes' state B so that it becomes 'more similar' to A (again no
operational definition of the terms in quotation marks is given). One may
expect a priori that an abrupt change from Fig. 1b to 1a will take place
only in the last moment when B becomes exactly equal to A. The postulate
of continuity of cause and effect requires, however, that a gradual
change from $B \neq A$ to $B = A$ as cause will lead to a gradual change of
effect, from all B's blocked to all B's passed by the A-filter. More precisely,
the continuity postulate requires that there be intermediate states B
between $B \neq A$ and $B = A$, with results intermediate between Fig. 1b
and 1a, that is, with some B's passing and some rejected, as pictured in
Fig. 1c; such cases then signify a 'fractional equality' between B and A,
written $B \sim A$. The ratio between passed and repelled B-state particles
can only be a statistical ratio, i.e. a probability ratio for an individual
B-state particle. Individual indeterminacy controlled by statistical ratios is a
consequence of the continuity postulate for cause and effect. The passing
fraction, written P(B, A), of B-state particles through the A-passing filter
may be taken as an operational definition of the fractional equality degree
between the states A and B, of value between 0 and 1. And since equality
degrees ought to be mutual, one will introduce the symmetry postulate,
P(A, B) = P(B, A); the latter is physically justified as the statistical
counterpart of the reversibility of deterministic processes. It stipulates
that the statistical fraction of B-state particles passed by an A-passing
filter equals the statistical fraction of A-state particles passing a B-filter.

Similar considerations apply to any game of chance with the
alternative 'yes' or 'no', passed or blocked, right or left, etc. For example,
when balls are dropped from a chute upon a knife edge, they will drop to
the right or to the left, depending on the aim of the chute.

According to the continuity postulate, however, there ought to be a
continuity of cases between all balls to the right and all to the left,
occurring within a small range of physical aim, with statistically ruled
ratios of r- and l-balls, gradually changing from 100:0 to 0:100 when the
physically regulated aim of the chute is changed from one to the other end
of the small angular range. Hypothetical reservations about concealed
causes for individual r- and l-events would never explain the miracle of
'statistical cooperation' of individual events yielding fixed statistical
ratios [1], [2].
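
The ball-and-knife-edge game can be imitated numerically. The following is only a minimal sketch: the linear dependence of the 'right' probability on the aim, the width of the angular range, and the names `fraction_right`, `half_width`, `trials`, and `seed` are assumptions made for illustration and are not taken from the text.

```python
import random


def fraction_right(aim, half_width=1.0, trials=10_000, seed=0):
    """Fraction of balls that fall to the right of the knife edge.

    Assumption (not from the text): inside the small angular range
    [-half_width, +half_width] the chance of 'right' rises linearly
    with the aim of the chute.
    """
    # Outside the range the outcome is certain, so clamp to [0, 1].
    p_right = min(1.0, max(0.0, (aim + half_width) / (2.0 * half_width)))
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = sum(1 for _ in range(trials) if rng.random() < p_right)
    return hits / trials


if __name__ == "__main__":
    # The ratio of r- to l-balls changes gradually across the range.
    for aim in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(f"aim={aim:+.1f}  fraction right={fraction_right(aim):.3f}")
```

Each individual drop is undetermined, yet the ratio for a given aim is stable over many drops, which is the 'statistical cooperation' referred to above.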

Next we introduce the empirical postulate of reproducibility of a test
result, which stipulates that a B-state particle in Fig. 1c which has once
passed the A-filter will pass an A-filter again with certainty. This harmless
looking postulate implies that the incident B-state particle, in the first
act of passing the A-filter, must have changed its state from B to A.
Indeed, only thus will it pass another A-filter again with certainty.
Similarly, an incident B-state particle once repelled by the A-filter must
have jumped, by virtue of its first repulsion, from B to the new state $\bar{A}$
so that it will be repelled again if tested once more by the A-filter.
Discontinuous changes of state (transitions, jumps) in reaction to a testing
instrument can thus be seen as consequences of the postulate of
reproducibility of a test result and continuity of cause and effect. To these
postulates we have added that of symmetry, P(A, B) = P(B, A), in
which P now assumes the meaning of a transition probability from state B
to A in an A-filter test, and from A to B in a B-filter test.

2. The Probability Schema. After these ideological preparations we
come to the mathematical schema of the probabilities of transition.
Consider a class of entities S (= 'states' of a given atom) which are in a
mutual relation of 'fractional equality' $S_m \sim S_n$, quantitatively
described by positive fractional numbers, $P(S_m, S_n)$, denoted as 'equality
fractions'. Special cases are P = 0 (separability, total inequality of $S_m$
and $S_n$) and P = 1 (identity, inseparability). The P-relations permit a
division of the elements of class S into subclasses, the subclass A with
members $A_1, A_2, \ldots$ which satisfy the orthogonality relation

(1)  $P(A_m, A_{m'}) = \delta_{mm'}$,

the subclass B, and C, and so forth. (The selection of complete orthogonal
subclasses out of the entirety of entities S is not unique, a fact known to
the quantum theorist as 'degeneracy'.)

P-values  connecting  the  elements  of  two  subclasses  such  as  A  and  B 
may  be  arranged  in  a  matrix: 


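(2)  $(P_{AB}) = \begin{pmatrix} P(A_1, B_1) & P(A_1, B_2) & \cdots \\ P(A_2, B_1) & \cdots & \cdots \\ \cdots & \cdots & \cdots \end{pmatrix}$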


The physical interpretation of the P's as probabilities of transition in
tests justifies the postulate that the sum of the transition probabilities
from any one state $A_m$ to the various states $B_1, B_2, \ldots$ be unity, i.e. that
each row of the matrix (2) sums up to unity. Furthermore, according to
the symmetry postulate

(3)  $P(A_m, B_n) = P(B_n, A_m)$,

the columns of the matrix $(P_{AB})$ are the rows of the matrix $(P_{BA})$ so that
the columns of $(P_{AB})$ also have sum unity:

(3')  $\sum_n P(A_m, B_n) = 1$ and $\sum_m P(A_m, B_n) = 1$.

Suppose now that the matrix (2) has M rows and N columns. The sum of
all its elements would then be M when summing the rows, and N when
summing the columns. Thus M = N, that is, the matrices $(P_{AB})$ and
$(P_{AC})$ etc. must be square, and the subclasses A, B, C, ... must all have
the same multiplicity, M. The multiplicity M of the orthogonal sets of
states may be finite or infinite depending on the 'kind' of particle. The
P-matrices are unit magic squares.
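
The counting argument for M = N can be checked on small examples. This is a minimal sketch; the function name `is_unit_magic_square` and the two sample matrices are hypothetical and chosen only for illustration.

```python
def is_unit_magic_square(matrix, tol=1e-12):
    """True if all entries are non-negative and every row and every
    column of the matrix sums to 1."""
    rows, cols = len(matrix), len(matrix[0])
    if any(len(row) != cols for row in matrix):
        return False
    if any(x < 0 for row in matrix for x in row):
        return False
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in matrix)
    cols_ok = all(
        abs(sum(matrix[m][n] for m in range(rows)) - 1.0) <= tol
        for n in range(cols)
    )
    return rows_ok and cols_ok


# A 2 x 2 table of transition probabilities (illustrative values).
P_AB = [[0.25, 0.75],
        [0.75, 0.25]]

# A 2 x 3 table whose rows sum to 1: the grand total is M = 2, so the
# N = 3 columns cannot all sum to 1 as well.
P_rect = [[0.50, 0.25, 0.25],
          [0.25, 0.50, 0.25]]

total = sum(sum(row) for row in P_rect)

if __name__ == "__main__":
    print(is_unit_magic_square(P_AB))
    print(is_unit_magic_square(P_rect), total)
```

The rectangular table fails because its grand total equals the number of rows, which cannot also equal the number of columns unless the two agree.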

3. The Probability Metric. We now introduce the further postulate that
the various P-matrices are interdependent by virtue of a general law
according to which one matrix (P) in a group is determined by the other
matrices (P) of the same group. Only the following simple interdependence
laws between two-index quantities are feasible:

(4)  the addition law $U_{AC} = U_{AB} + U_{BC}$,
     made self-consistent by $U_{AB} = -U_{BA}$,

and corresponding laws for distorted quantities W = f(U), e.g. for $W = e^U$:

(5)  the multiplication law $W_{AC} = W_{AB} \cdot W_{BC}$,
     made self-consistent by $W_{AB} \cdot W_{BA} = 1$.

There is no other conceivable way of making $U_{AC}$ or $W_{AC}$ independent
of the choice of the intermediate entity B than the addition theorem (4)
and its generalization by distortion.

A model of (4) is furnished by the geometry of lengths $L_{AB}$, $L_{AC}$, etc.
in frameworks connecting points A, B, C, .... Although (4) cannot be
applied to the lengths L themselves, it may be applied to a substructure of
quantities $\varphi$ satisfying the triangular relation $\varphi_{AC} = \varphi_{AB} + \varphi_{BC}$ with
$\varphi_{AB} = -\varphi_{BA}$, known as vectors. The latter determine the lengths $L = |\varphi|$.
Of particular interest is plane geometry where vectors $\varphi$ can be written as
complex symbols, $\varphi = |\varphi| \cdot e^{i\theta}$. Also in a plane, 5 points are connected by
10 lengths; when 9 of them are given they uniquely determine the
tenth L.

In order to construct a law of interdependence between unit magic
squares one may start from (5). Although (5) cannot be applied to the
matrices (P) themselves, it may be applied to a substructure of quantities
$\psi$ which are to satisfy the matrix multiplication formula

(6)  $(\psi_{AC}) = (\psi_{AB}) \cdot (\psi_{BC})$, with $(\psi_{AA}) = (\psi_{AB}) \cdot (\psi_{BA}) = (1)$.

When now decreeing (the asterisk standing for the complex conjugate):

(7)  $\psi(A_k, B_n) = \psi^*(B_n, A_k)$ and $P = |\psi|^2$,

the P-matrices become unit magic squares, as required. (6) is known as
the law of unitary transformation, connecting 'orthogonal axes systems'
A and B etc. by 'complex directional cosines' $\psi$. A tensor f in general
obeys the transformation formula

(8)  $(f_{AD}) = (\psi_{AB}) \cdot (f_{BC}) \cdot (\psi_{CD})$.

To the physicist, the quantities $\psi$ are the 'probability amplitudes' which
satisfy the law of interference (6), and the tensors f are 'observables'. When
f has its eigenvalues in the states $F_1, F_2, \ldots$, that is, when

(9)  $f(F_n, F_{n'}) = f(F_n) \cdot \delta_{nn'}$,

then, as a special case of (8), one has

(9')  $f(A_k, A_j) = \sum_n \psi(A_k, F_n) \cdot f(F_n) \cdot \psi(F_n, A_j)$.

The $\psi$-interference law, with the corresponding transformation law for
observables, was first found inductively and was considered a most
surprising empirical law of nature. It turns out to be the only conceivable
solution of the mathematical problem of finding a general self-consistent
law connecting unit magic squares, viz. the law of unitary transformation.
In opposition to numerous physicists who see in the interference law
for complex probability amplitudes a profound and unfathomable plan of
nature presenting us with an abstract and unpictorial substructure of
reality manifest in a wave-particle duality, it may be noticed that
(a) each complex $\psi$ may be pictured as a vector in a plane giving direction
to the corresponding probability P so that the P-metric can be visualized
as a structural framework of lines in a plane, and (b) similar to plane
geometry where 5 points A, B, C, D, E are connected by 10 lengths $L_{AB}$,
$L_{AC}$, etc. and 9 L's uniquely determine the tenth L, so are there direct
relations between the 10 unit magic square matrices $(P_{AB})$, $(P_{AC})$, etc.
which connect 5 orthogonal sets of states so that 9 P-matrices uniquely
determine the tenth. That is, there are direct relations between the real
probabilities P which can be formulated without resorting to complex
quantities $\psi$ with wave-like phase angles.
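
The law (6) together with the decree (7) can be tried out on 2 x 2 matrices. What follows is a sketch under stated assumptions: the parametrization `unitary(theta, phase)`, the numerical angles, and all helper names are illustrative choices and do not come from the text.

```python
import cmath
import math


def unitary(theta, phase):
    """A 2 x 2 unitary matrix (one convenient, assumed parametrization)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, cmath.exp(1j * phase) * s],
            [-cmath.exp(-1j * phase) * s, c]]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(x):
    """Conjugate transpose, i.e. psi(B, A) obtained from psi(A, B) by (7)."""
    n = len(x)
    return [[x[j][i].conjugate() for j in range(n)] for i in range(n)]


def probabilities(psi):
    """P = |psi|^2, taken element by element."""
    return [[abs(z) ** 2 for z in row] for row in psi]


def is_unit_magic_square(p, tol=1e-12):
    n = len(p)
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in p)
    cols_ok = all(abs(sum(p[m][k] for m in range(n)) - 1.0) <= tol
                  for k in range(n))
    return rows_ok and cols_ok


psi_AB = unitary(0.3, 0.4)
psi_BC = unitary(1.1, -0.7)
psi_AC = matmul(psi_AB, psi_BC)            # law (6)
identity = matmul(psi_AB, dagger(psi_AB))  # (psi_AB).(psi_BA) = (1)

P_AB, P_BC, P_AC = (probabilities(m) for m in (psi_AB, psi_BC, psi_AC))
classical = matmul(P_AB, P_BC)             # what multiplying the P's would give

if __name__ == "__main__":
    print(is_unit_magic_square(P_AC))
    print(P_AC[0][0], classical[0][0])
```

In this example the matrix `P_AC` is again a unit magic square, yet it differs from the plain product of `P_AB` and `P_BC`, in line with the remark that (5) cannot be applied to the matrices (P) themselves.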

4. Quantum Periodicity Rules. The quantum theorems of Born and
Schrödinger

(10)  $qp - pq = h/2\pi i$ and $p = (h/2\pi i)\, d/dq$

are equivalent to the rule that the amplitude function $\psi(q, p)$ is a complex
exponential function

(11)  $\psi(q, p) = \exp(iqp/\text{const})$

with const = $h/2\pi$. The quantum rules (10) or (11) are usually introduced
ad hoc as inductive results of quantum experience. I am going to show
that they are consequences of the following postulates added to those
introduced before:

a)  Linear  coordinates  q  and  linear  momenta  p  are  physically  defined 
up  to  additional  constants  so  that  there  are  observables  whose 
values  depend  on  q-differences  and  on  p-differences  only. 

b)  The statistical density of conjugates q and p is constant in qp-space
(as it is in classical statistical mechanics).

The proof of (11) on the grounds of (a) and (b) rests on the fact that the
complex exponential function, $f(x) = \exp(ix/\text{const})$, is the only function
f(x) which, together with its complex conjugate $f^*(x)$, satisfies the
condition that the product $f(x_1) \cdot f^*(x_2)$ will depend on the difference
$x_1 - x_2$ only.
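
The stated property of the complex exponential is easy to check numerically. In this sketch the comparison function `g`, the value of `const`, and the sample points are arbitrary assumptions; the check illustrates the property and is no proof of uniqueness.

```python
import cmath


def f(x, const=1.0):
    """The complex exponential exp(ix/const)."""
    return cmath.exp(1j * x / const)


def g(x):
    """A comparison function with a quadratic phase (illustrative only)."""
    return cmath.exp(1j * x * x)


def product(func, x1, x2):
    """func(x1) times the complex conjugate of func(x2)."""
    return func(x1) * func(x2).conjugate()


# Two pairs of arguments with the same difference x1 - x2 = 0.75.
f_first = product(f, 1.0, 0.25)
f_second = product(f, 3.5, 2.75)
g_first = product(g, 1.0, 0.25)
g_second = product(g, 3.5, 2.75)

if __name__ == "__main__":
    print(abs(f_first - f_second))  # vanishes up to rounding
    print(abs(g_first - g_second))  # does not vanish
```

For the exponential the product reduces to exp(i(x1 - x2)/const), whereas for the quadratic phase it retains a dependence on x1 and x2 separately.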

The detailed proof runs as follows. As a special case of (9') for an
observable f defined as a function of q one has

$f(p_k, p_j) = \sum_n \psi(p_k, q_n)\, f(q_n)\, \psi(q_n, p_j)$.

If q is a linear coordinate running continuously from $-\infty$ to $+\infty$, and
for given p-values has constant $|\psi|^2$ density, the last formula becomes an
integral with constant weight factor in the integrand:

(12)  $f(p_k, p_j) = \int \psi(p_k, q)\, f(q)\, \psi(q, p_j)\, dq$.

Since f(q) may be any observable whatsoever, one may consider the case
that it is a $\delta$-function with maximum at any chosen place $q_1$; the integral
then reduces to

$f(p_k, p_j) = \psi(p_k, q_1)\, \psi^*(p_j, q_1)$.

If the 'transition value' $f(p_k, p_j)$ is to depend on the difference $p_k - p_j$
only, the function $\psi$ on the right must contain p in the form

(13)  $\psi(q, p) = \exp(\ldots ip \ldots)$.

An analogous consideration applied to an observable g(p) which may be
chosen as a $\delta$-function yields the result that the function $\psi$ must contain
q in the form

(13')  $\psi(q, p) = \exp(\ldots iq \ldots)$.

(13) and (13') together leave only the following alternative: Either
$\psi(q, p)$ is of the form

$\psi(q, p) = \exp(\alpha iq + \beta ip)$

with separate real constant factors $\alpha$ and $\beta$, or

(14)  $\psi(q, p) = \exp(i\gamma qp)$

with common real factor $\gamma$. The first alternative would lead, according
to (12), to

$f(p_k, p_j) = \exp[i\beta(p_k - p_j)] \int f(q)\, dq = \exp[i\beta(p_k - p_j)] \cdot \text{const}$,

where the left hand side depends on the choice of the function f, whereas
the right hand side does not. Only the second alternative makes sense.
When one writes $h/2\pi$ for $1/\gamma$, Eq. (14) is identical with (11), q.e.d.
Eq. (11) is the fundamental wave function of quantum dynamics with wave
length $\lambda = h/p$.

For completeness' sake we add the well-known deduction of the symmetry
theorems which are of such decisive importance for the aggregation
of identical particles. Identity of two particles a and b signifies their
indiscernibility  and  in  particular  equality  of  the  two  transition  probabilities 


or omitting reference to S:

$|\psi(a_i, b_j)|^2 = |\psi(b_i, a_j)|^2$.


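This equation can be satisfied only when $\psi$ is either symmetric or
antisymmetric with respect to an exchange of the letters a and b, proved
as follows. Write

$\psi(a_i, b_j) = \frac{1}{2}[\psi(a_i, b_j) + \psi(b_i, a_j)] + \frac{1}{2}[\psi(a_i, b_j) - \psi(b_i, a_j)]$
$= \varphi_{sym}(a, b) + \varphi_{ant}(a, b)$.

Similarly

$\psi(b_i, a_j) = \varphi_{sym}(a, b) - \varphi_{ant}(a, b)$.

Taking the absolute squares of the two last equations one arrives at

$P(a_i, b_j) = |\varphi_{sym}|^2 + |\varphi_{ant}|^2 + \text{real part of } (2\varphi_{sym}\varphi_{ant}^*)$
$P(b_i, a_j) = \text{same} - \text{real part of same}$

The two P's can be equal only when either $\varphi_{sym}$ or $\varphi_{ant}$ vanishes, i.e.
(excluding the trivial case of $\psi \equiv 0$) when either $\psi = \varphi_{ant}$ or $\psi = \varphi_{sym}$,
q.e.d.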

For systems of three or more identical particles $\psi(a, b, c, \ldots)$ must
either be symmetric with respect to the exchange of each pair, or
antisymmetric. Indeed, if $\psi$ were symmetric with respect to a and b, but
antisymmetric with respect to a and c, one would arrive at the following
sequence:

$+\psi(a, b, c) = +\psi(b, a, c) = -\psi(b, c, a) = -\psi(a, c, b)$
$= +\psi(c, a, b) = +\psi(c, b, a) = -\psi(a, b, c)$,

which is self-contradictory. All particles are thus divided in two classes,
those which form symmetric, and those which form antisymmetric
$\psi$-functions.
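
The chain of exchanges in the sequence above can be followed mechanically. This is a small sketch; the names `swap_letters`, `SIGN`, and `follow` are introduced here only for illustration.

```python
def swap_letters(arrangement, x, y):
    """Exchange the letters x and y wherever they stand in the arrangement."""
    mapping = {x: y, y: x}
    return tuple(mapping.get(letter, letter) for letter in arrangement)


# Hypothesis under test: symmetric in (a, b), antisymmetric in (a, c).
SIGN = {frozenset("ab"): +1, frozenset("ac"): -1}


def follow(arrangement, swaps):
    """Apply the exchanges in order and accumulate the sign factor."""
    sign = 1
    for x, y in swaps:
        arrangement = swap_letters(arrangement, x, y)
        sign *= SIGN[frozenset((x, y))]
    return arrangement, sign


# The six exchanges used in the sequence: ab, ac, ab, ac, ab, ac.
final, sign = follow(("a", "b", "c"), [("a", "b"), ("a", "c")] * 3)

if __name__ == "__main__":
    print(final, sign)
```

The arrangement returns to (a, b, c) while the accumulated factor is -1, so a non-vanishing function with this mixed behaviour would have to equal its own negative.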

This  concludes  the  deduction  of  the  quantum  theorems  from  basic 
postulates  of  a  non-quantal  character. 

5. Quantum Fact and Fiction. A few remarks may be added concerning
the present quantum philosophy, reputedly the most revolutionary
innovation in the theory of knowledge of the century. Its starting point is
the allegation that quantum theory has invalidated the notion of objective
states possessed by a microphysical system independent of an observer
(according to some authorities) or independent of a measuring instrument
(according to others). And the quantity $\psi$ is said to have a particularly
'subjective' character in so far as it expresses expectations of an observer,
rather than states of an atom. $\psi$ is also reputed to be 'abstract' and
'unanschaulich' (unpictorial) due to its complex-imaginary form.

In the writer's opinion, this quantum philosophy rests on various
misunderstandings and fictions. First, complex quantities stand for
vectors in a plane; hence $\psi$ gives direction to the transition probabilities
so that the latter form a structural framework in a plane. The
$\psi$-multiplication law (6) is quite analogous to the geometrical vector addition
law $\varphi_{AC} = \varphi_{AB} + \varphi_{BC}$. But nobody has yet found plane geometry
abstract and unpictorial because it connects real lengths by vectors
which could be symbolized by complex numbers.

Second, since a test resulting in the state $A_m$ of an atom is reproducible
by means of the same A-meter, one may legitimately denote the state
$A_m$ as being 'objectively possessed' by the atom. It is true that a
subsequent B-test throws the atom into a new (equally reproducible) state
$B_n$. Thus one does not have the right to say, or even to imagine, that the
atom is in the two states $A_m$ and $B_n$ simultaneously; the two states are
'incompatible'. But incompatibility as such is nothing novel and
revolutionary. A state of angular twist value w of a rod of ice, and a viscosity
value v of the same sample in the liquid state are mutually incompatible;
there are no combination wv-states. It is significant of quantum dynamics
that a state q and a state p, though individually reproducible, do not allow
reproducible 'objective' qp-states; and if an objective q-state has been
ascertained one must not even imagine any hidden simultaneous p-value
to prevail. But this is not initiating a new philosophy of knowledge. It
merely tells us to be careful with the application of the term 'objective
state'. Of course, physicists are more impressed by the example of
qp-incompatibility than by the trivial example of wv-incompatibility. Yet
after thirty years of emphasizing differences, one may as well begin
stressing similarities between quantum physics and everyday experience.

Third, in this connection one ought to remember that statistical law,
as opposed to classical determinism, is known from ordinary games of
chance; they, too, confront us with the 'miracle of statistical cooperation'
of individual events irreducible in principle [1], [2] to hidden causes.
There is no structural difference between the ordinary ball-knife game
described above and the quantum game of Fig. 1c.

Fourth, a great issue has been made of $\psi$ being a subjective expectation
function which suddenly collapses or contracts in violation of the 'wave
equation' when a definite observation is made, turning potentiality into
actuality. However, in spite of subjectively tainted words 'expectation'
and 'probability', the quantum theory, like any other theory in physics,
correlates experimental data rather than mental states; in particular it
correlates statistical experience gained in tests of atoms with macroscopic
instruments. If someone uses these statistical laws (which are of the
same quality as the Gauss law of errors) for placing bets or for enjoying
anticipations of future events, this is his personal affair and has nothing
to do with the quantum theory. (Similarly, nobody has yet found a
subjective element in Gauss' error law, or in Newton's law of attraction
because astronomers anticipate eclipses with high accuracy.) The fiction
that quantum theory deals with differential equations for expectations
rather than with the correlation of objective data which never collapse,
has instilled utter confusion into the 'quantum theory of measurement'.
Here we learn that a $\psi$-function, after first developing according to the
Schrödinger equation as a kind of 'process equation of motion', suddenly
collapses whenever a point event takes place (according to some
authorities) or only when an observer takes notice of the point event (according
to others). But since nobody can seriously believe in such inconsistencies,
one tries at least to talk away the difficulty, as testified by extended
discussions at many symposiums on 'measurement' during the last thirty
years. The chief trouble is the mistaken view that the Schrödinger
equation describes a physical change of state, either individually or
statistically. Actually it connects various mathematical 'representations'
of one and the same fixed state with one another, be it the fixed state A
before the measurement, or B after the measurement [3], [4], [5], [6].

Fifth, confusion prevails also with respect to the famous wave-particle
duality. In fact the latter has become illusory since Max Born thirty
years ago introduced the statistical particle interpretation of the 'wave
function' and thereby restored a unitary particle theory, following a short
period of doubt whether matter really consisted of waves or of particles.
Before Born it was considered philosophical to argue that neither waves
nor particles are 'real'; but the same pseudo-philosophical talk has
survived although physicists in their sober hours consider particles, and
particles alone, as the constituting substance of matter (in the
non-relativistic domain). Still talking of duality, i.e. drawing a parallel between
a thing (particle) and one of its many qualities (its occasional periodic
probability distribution in space and time) is illogical.

The great merit of Schrödinger's original matter wave theory had been
that it gave an explanation of the discreteness of quantum states in terms
of proper vibrations in a medium. But Born's statistical interpretation,
confirmed by the observation of point events, destroyed the
explanatory character of the Schrödinger waves, without substituting a
rational explanation for the wave-like phenomena. The present
investigation is to fill this gap. The wave-like $\psi$-interference becomes a natural
and necessary quality of particles under the postulate that the unit magic
square P-tables are connected by a self-consistent law; the only
conceivable such law is that of unitary transformation, which is identical
with that of $\psi$-interference (6). Furthermore, the wave-like $\psi$-periodicity,
the basis of all 'quantization', becomes a natural and obvious particle
quality under the postulates (a) and (b) for conjugate observables q and p.

Postscript: The deduction on p. 360 (of the symmetry theorems for identical
particles) is inconclusive. Only perturbation theory leads to the symmetry
principles.



Bibliography 

[1]    LANDÉ, A., The case for indeterminism. In 'Determinism and Freedom', edited by
       Sidney Hook, New York University Press (1958), p. 69.
[2]    LANDÉ, A., Determinism versus continuity in modern science. Mind, vol. 67 (1958),
       pp. 174-181.
[3]    LANDÉ, A., Foundations of quantum theory. Yale University Press, 1955.
[4]    LANDÉ, A., The logic of quanta. British Journal for the Philosophy of Science,
       vol. 6 (1956), pp. 300-320.
[5]    LANDÉ, A., Non-quantal foundations of quantum theory. Philosophy of Science,
       vol. 24 (1957), pp. 309-320.
[6]    LANDÉ, A., Zeitschrift für Physik, vol. 153 (1959), pp. 389-393.


Symposium  on  the  Axiomatic  Method 


QUANTUM LOGIC AND THE COMMUTATIVE LAW

PASCUAL JORDAN

Universität Hamburg, Hamburg, Germany

It is well known that quantum mechanics works with operators or matrices;
and if we wish to make the basic ideas of quantum mechanics clear to
ourselves, it is advisable to set aside entirely the mathematical problems
connected with the theory of infinite matrices. Mathematically speaking, we
then have to do only with algebra.

We therefore consider a quantum-physical system (examples could easily be
named) whose measurable properties are to be represented by the matrices of
a finite degree $n$, the matrix elements being arbitrary complex numbers.
As is well known, the theory teaches:

To every Hermitian matrix $A$ within the algebra of these matrices there
corresponds a measurable quantity (in other words: a possible structure of
a measuring instrument applicable to the system). The eigenvalues of the
matrix $A$ are the possible results that a measurement of $A$ can yield.
Mathematically, the matrix $A$ can be represented in the form

$$ A = \sum_k a_k e_k, \tag{1} $$

where the $e_k$ are orthogonal (Hermitian) idempotents,

$$ e_j e_k = \delta_{jk}\, e_k, $$

while the $a_k$ denote the eigenvalues of $A$. The statement "a measurement
of the quantity $A$ has yielded the eigenvalue $a_1$" can therefore be
replaced by the statement that a measurement of the quantity $e_1$ has
yielded its eigenvalue 1 (and not its other eigenvalue 0). We therefore
need to speak only of the idempotents.

Let the idempotent quantity $e_1$, whose measurement gave the eigenvalue 1,
be in particular indecomposable, i.e. not representable as a sum of two
orthogonal idempotents. Then let an arbitrary idempotent $e'$ be measured
next. What is the probability that we find the value 1 for $e'$? Quantum
mechanics (or the "statistical transformation theory") answers:

$$ w = \mathrm{Sp}(e_1 e'). \tag{2} $$

Here Sp denotes the trace of the matrix $e_1 e'$.
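The trace rule (2) can be tried out numerically. The following is a minimal sketch in plain Python, assuming two illustrative unit vectors `xi` and `eta` (they are not taken from the text) that define the indecomposable idempotents; the helper names `projector`, `matmul` and `trace` are likewise hypothetical. For such rank-one idempotents the trace should coincide with the squared modulus of the inner product of the two vectors.

```python
# Sketch: probability w = Sp(e1 e') for two rank-one Hermitian idempotents.
# The vectors xi and eta are arbitrary illustrative choices.

def projector(xi):
    """Matrix with entries conj(xi_k) * xi_l, built from a unit vector xi."""
    return [[xk.conjugate() * xl for xl in xi] for xk in xi]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements (the 'Spur')."""
    return sum(a[i][i] for i in range(len(a)))

s = 2 ** -0.5
xi = [1 + 0j, 0j]            # unit vector defining e1
eta = [s + 0j, s * 1j]       # unit vector defining e'

e1 = projector(xi)
e_prime = projector(eta)

# Formula (2): the probability is the trace of the product e1 e'.
w = trace(matmul(e1, e_prime)).real

# Equivalent form for rank-one idempotents: squared modulus of the overlap.
overlap = abs(sum(x.conjugate() * y for x, y in zip(xi, eta))) ** 2

# Idempotence check for e': e' e' should reproduce e' up to rounding.
ee = matmul(e_prime, e_prime)
idempotent = all(abs(ee[i][j] - e_prime[i][j]) < 1e-12
                 for i in range(2) for j in range(2))
```

With these particular vectors both `w` and `overlap` come out as one half up to rounding, as one would expect for a state "halfway" between the two eigenvectors of $e_1$.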


These formulations summarize the entire fundamental content of quantum
mechanics.

One can, however, give the matter another form, mathematically equivalent
to the one just explained. We consider a projective geometry of $n - 1$
dimensions or, in other words, unit vectors in a space of $n$ dimensions.
The components $\xi_k$ of such vectors are to be arbitrary complex numbers.
Every indecomposable idempotent can then be represented as a matrix

$$ e' = (\xi_k^* \xi_l) \quad\text{with}\quad \mathrm{Sp}(e') = \sum_k |\xi_k|^2 = 1. \tag{3} $$

(By $\xi^*$ we denote the complex conjugate of $\xi$.) More generally,
there is a one-to-one correspondence between the linear families of the
vectors considered (or the linear subspaces of the projective geometry) and
the Hermitian idempotents of the matrix algebra considered earlier. Instead
of speaking of the idempotents, we can therefore speak of the associated
families of vectors. If both idempotents in (2) are indecomposable, we have
(in an immediately intelligible notation)

$$ w = \mathrm{Sp}(e_1 e') = \Bigl| \sum_k \xi_k^* \xi'_k \Bigr|^2. \tag{4} $$

This second formulation of the fundamental laws of quantum mechanics is
closer than the other to Schrödinger's "wave mechanics".

There is, however, a third formulation, again different in appearance,
which was put forward by Birkhoff and v. Neumann. The $(n - 1)$-dimensional
projective geometry can be defined mathematically as a lattice ("Verband")
with certain properties; and the resulting incorporation of quantum
mechanics into the mathematical theory of lattices gives us a surprising
new insight: the transition from classical mechanics to quantum mechanics,
which is usually regarded as a transition from a commutative to a
noncommutative algebra of the measurable quantities, can also be regarded
as a transition from distributive lattices to modular lattices.

In classical mechanics, every piece of information or statement about the
state of a system that is based on a measurement result can be expressed by
saying that the point representing the state of the system in phase space
lies within a certain point set $a$ of the phase space. The intersection
$a \cap b$ of two point sets in phase space thus corresponds to joining the
two associated statements by "and"; the union $a \cup b$ corresponds to
joining the two statements by "or". In this way it is immediately evident
that the totality of possible statements about the state of the classical
mechanical system, just like the subsets of a point set, forms a
distributive lattice. Furthermore, the negation of a statement corresponds
to the passage from the point set $a$ to the complementary point set
$\bar a$.

As is well known, in mathematics we speak of a lattice when, for a certain
set of elements $a, b, \ldots$, operations $\cap$, $\cup$ are defined which
are associative and commutative and in addition satisfy the axiom

$$ (a \cap b) \cup a = a \cap (b \cup a) = a, \tag{5} $$

from which the idempotence of all elements under both operations follows:

$$ a \cap a = a \cup a = a. \tag{6} $$

If two particular elements $a$, $b$ satisfy the relation $a \cap b = a$
(equivalent to $a \cup b = b$), we also write $a \subseteq b$; this
relation of containment is reflexive and transitive. From $a \subseteq b$
and $b \subseteq a$ it follows that $a = b$.

As is well known, a lattice is called distributive if it satisfies the
additional axiom

$$ a \cap (b \cup c) = (a \cap b) \cup (a \cap c). \tag{7} $$

The dual law, obtained from (7) by interchanging the signs $\cap$, $\cup$,
then holds at the same time; a fact that can be proved, for example, by
showing (7) to be equivalent to the following dually symmetric relation:

$$ (a \cap b) \cup (a \cap c) \cup (b \cap c) = (a \cup b) \cap (a \cup c) \cap (b \cup c). \tag{8} $$

Finally, let it be mentioned that for the passage from a subset $a$ to the
complementary subset $\bar a$ ("negation") the following axioms hold:

$$ \bar{\bar a} = a; \qquad \overline{a \cap b} = \bar b \cup \bar a; $$
$$ a \cap \bar a = 0 = \text{empty set}; \qquad a \cup \bar a = 1 = \text{full set}. \tag{9} $$
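The lattice axioms (5), (7), (8) and the negation rules (9) can be checked mechanically for the subsets of a small point set. The sketch below assumes a three-point stand-in for the phase space (an illustrative choice, not taken from the text) and uses Python's `frozenset` operations `&` and `|` for intersection and union; the names `universe`, `neg` and `check_classical_logic` are hypothetical.

```python
from itertools import combinations, product

# A three-point "phase space", chosen only for illustration.
universe = frozenset(range(3))
subsets = [frozenset(c) for r in range(len(universe) + 1)
           for c in combinations(sorted(universe), r)]

def neg(a):
    """Complementary point set (negation of the statement)."""
    return universe - a

def check_classical_logic():
    """Assert (5), (7), (8) and (9) for all triples of subsets."""
    for a, b, c in product(subsets, repeat=3):
        # (5): absorption
        assert (a & b) | a == a == a & (b | a)
        # (7): distributive law
        assert a & (b | c) == (a & b) | (a & c)
        # (8): dually symmetric form of the distributive law
        assert (a & b) | (a & c) | (b & c) == (a | b) & (a | c) & (b | c)
        # (9): rules for the negation
        assert neg(neg(a)) == a
        assert neg(a & b) == neg(b) | neg(a)
        assert a & neg(a) == frozenset()
        assert a | neg(a) == universe
    return True

classical_ok = check_classical_logic()
```

An exhaustive check over eight subsets is of course no proof for general sets, but it makes the content of the axioms concrete.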

If, instead of a classical system, we now consider a quantum-mechanical one
(again with finite degree $n$ of its matrix algebra), then the point sets
in phase space are replaced by the Hermitian idempotents, or by the linear
subspaces of the $(n - 1)$-dimensional projective geometry which are in
one-to-one correspondence with them. These likewise admit operations
$\cap$, $\cup$, namely the intersection $a \cap b$ of $a$ and $b$, and the
linear space $a \cup b$ spanned by $a$ and $b$. But the lattice so defined
is no longer distributive; instead it satisfies only the weaker Dedekind
modular axiom, which (to show at once its likewise dually symmetric
meaning) can be formulated as follows:

$$ (a \cap b) \cup [c \cap (a \cup b)] = [(a \cap b) \cup c] \cap (a \cup b). \tag{10} $$
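That a lattice of linear subspaces is modular but not distributive can be seen in a small finite model. The sketch below is an illustration under a loudly stated assumption: it replaces the complex vector space of the text by the three-dimensional space over the two-element field GF(2), so that all subspaces can be enumerated. The names `span`, `meet`, `join`, `modular` and `distributive` are hypothetical helpers.

```python
from itertools import combinations, product

DIM = 3
vectors = list(product((0, 1), repeat=DIM))   # all vectors of GF(2)^3

def add(u, v):
    """Componentwise addition modulo 2."""
    return tuple((x + y) % 2 for x, y in zip(u, v))

def span(gens):
    """Smallest subspace of GF(2)^DIM containing the given vectors."""
    space = {(0,) * DIM}
    for g in gens:
        space |= {add(g, s) for s in space}
    return frozenset(space)

# Every subspace is spanned by at most DIM vectors.
subspaces = {span(c) for r in range(DIM + 1) for c in combinations(vectors, r)}

def meet(a, b):
    return a & b            # intersection of two subspaces

def join(a, b):
    return span(a | b)      # subspace spanned by a and b

def modular(a, b, c):
    """Modular axiom (10)."""
    lhs = join(meet(a, b), meet(c, join(a, b)))
    rhs = meet(join(meet(a, b), c), join(a, b))
    return lhs == rhs

def distributive(a, b, c):
    """Distributive axiom (7)."""
    return meet(a, join(b, c)) == join(meet(a, b), meet(a, c))

triples = list(product(subspaces, repeat=3))
all_modular = all(modular(*t) for t in triples)
all_distributive = all(distributive(*t) for t in triples)
```

In this model `all_modular` holds while `all_distributive` fails; three different lines lying in one plane already violate (7).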

Following Birkhoff and v. Neumann, one can express this circumstance by
speaking of a quantum logic in contrast to a classical logic. Of course it
is a matter of taste whether one wishes to accept this terminology; it is,
however, natural in any case if by "logic" one understands the laws
governing the possible combinations of statements or items of information
about the state of a physical system. In this conception logic, too, is an
empirical science, because only empirically can it be settled which
totality of possible statements belongs to a given physical system.
(Evidently this in no way contradicts the fact that, on the other hand, all
considerations concerning quantum theory can be formulated and carried out
using only classical, i.e. distributive, logic.)

In quantum logic, too, there is a negation, namely $\bar e = 1 - e$, for
which the axioms (9) hold, where 0 and 1 are now to be understood as the
elements of the matrix algebra denoted by these signs.

Decisive for the justification of the Birkhoff-v. Neumann point of view,
however, is the following mathematical THEOREM established by v. Neumann:
A modular lattice which has some additional properties (it must be
irreducible, admit only finite chains of containment, and be
"complemented") is always a projective geometry of finite dimension. It is
"complemented", in a more special way, in particular when it also possesses
an operation of negation of the form discussed. In this case the skew field
belonging to the projective geometry has in particular the property which I
have denoted by the term "formally complex". If this skew field is to
contain the field of real numbers, then it must be either this field
itself, or the field of complex numbers, or the skew field of quaternions.

A charming THEOREM of v. Neumann states, incidentally, that the modular
lattices are characterized by the following property: If the distributive
relation (7) holds for three particular elements $a$, $b$, $c$, then it is
invariant under permutations of these three elements. Going further, one
can show (Jordan) that the whole sublattice generated by $a$, $b$, $c$ is
then distributive; and this permits the following clarification: Two
idempotents $e$, $e'$ commute, i.e. $ee' = e'e$, if and only if a
distributive relation holds between $e$, $\bar e$, $e'$.

After these preparations I come to the discussion of an idea which has
occupied me for a long time. For the further development of quantum theory
it could become necessary to extend or generalize the basic formalism of
quantum mechanics as discussed above. Are there mathematical possibilities
for doing so?

This question was first investigated by studying generalizations of the
associative matrix algebras [1, 2, 4]. These investigations have given rise
to a whole series of further mathematical investigations [5]. This side of
the development will not be discussed in more detail here, however, since
despite many attractive mathematical results it has so far yielded nothing
fruitful for physics.

It is natural, however, to study another possibility of generalization,
which consists in attempting once more, within quantum logic, the
transition from the commutative to the noncommutative. It has in fact
turned out that the theory of lattices can be generalized, by dropping the
axiom of commutativity, to a theory of "skew lattices" (Schrägverbände),
which raises an abundance of new and in part quite difficult questions, but
has already made possible many beautiful results, of which only a brief
indication can be given in what follows. This noncommutative generalization
of lattice theory was first envisaged by Klein-Barmen, later taken up by
the author, and independently also by Matsushita (compare [3]). The
author's considerations have been decisively advanced by the collaboration
of E. Witt and W. Böge.

We consider a set of elements with two associative operations $\vee$,
$\wedge$. We further require as basic axiom

$$ (a \wedge b) \vee a = a \wedge (b \vee a) = a, \tag{11} $$

from which, here too, the idempotence

$$ a \wedge a = a \vee a = a \tag{12} $$

follows. But whereas in (7) the commutative law permits manifold
rearrangements of the letters, the formulas so obtained are by no means to
be carried over to skew lattices as well. For example, the additional axiom

$$ (b \wedge a) \vee a = a \wedge (a \vee b) = a, \tag{13} $$

which is independent of (11), is satisfied only by a very special, fairly
trivial class of skew lattices.
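The step from the basic axiom (11) to the idempotence (12) is not spelled out in the text. One possible short derivation, given here only as a sketch, uses nothing but the two halves of (11) with a suitable substitution for $b$:

```latex
\begin{align*}
a \vee a   &= \bigl(a \wedge (b \vee a)\bigr) \vee a = a,
  && \text{since } a = a \wedge (b \vee a) \text{ and } (a \wedge x) \vee a = a
     \text{ for } x = b \vee a, \\
a \wedge a &= a \wedge \bigl((a \wedge b) \vee a\bigr) = a,
  && \text{since } a = (a \wedge b) \vee a \text{ and } a \wedge (y \vee a) = a
     \text{ for } y = a \wedge b.
\end{align*}
```

Neither commutativity nor associativity is needed in this argument, which is why the conclusion survives the passage to skew lattices.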

Now in every skew lattice there are four forms of a reflexive and
transitive containment, which in general have different meanings; if in a
particular skew lattice all four are equivalent, then it is commutative,
i.e. a lattice. In the general case there also occur corresponding
equivalence classes of more than one element. The four forms of the
containment of $a$ in $b$ are defined by the relations:


(14) 


Each form of strong containment entails the associated form of weak
containment, which we can indicate as follows:

              left                   right

  strong      $b \wedge a = a$       $b \vee a = b$

  weak        $a \vee b = b$         $a \wedge b = a$

The additional axiom (13) is then equivalent to the statement that both
forms of weak containment always hold simultaneously:


(13') 



Weaker than (13) is the axiom

$$ a \wedge b \wedge a = a \wedge b, \qquad a \vee b \vee a = b \vee a, \tag{15} $$

which is equivalent to the following relations concerning containment:


(15') 


Evidently we obtain (15') as a consequence of (11') and (13').

Since it is not arbitrary lattices but only modular lattices that are of
significance for quantum theory, the following fact seems encouraging, and
it is attractive from a purely mathematical point of view as well,
independently of physical speculations: The notion "modular" can be carried
over to skew lattices in a simple and beautiful way, namely in the form of
the following axiom, which is modelled exactly on formula (10); now,
however, the order of the symbols, which in (10) was largely arbitrary
because of commutativity, matters decisively:

$$ (a \wedge b) \vee [c \wedge (a \vee b)] = [(a \wedge b) \vee c] \wedge (a \vee b). \tag{16} $$

The analogy with the commutative case also proves itself in the following
sense: formula (10) can be replaced by the equivalent axiom that
$x \subseteq y$ always entails

$$ x \cup (c \cap y) = (x \cup c) \cap y. \tag{17} $$

Quite correspondingly, (16) is equivalent to the following statement: if
$x$ is contained in $y$ in both weak senses, then

$$ x \vee (c \wedge y) = (x \vee c) \wedge y. \tag{18} $$

The notion of "modular skew lattices" defined by (16) also proves
meaningful and appropriate in that there actually is a great abundance of
examples for it.

The following procedure is suitable for constructing wide classes of
examples of skew lattices: Suppose a certain skew lattice $W$ is already
given; in particular it may be a commutative one, i.e. a lattice. We denote
the operations within $W$ by the signs $\cap$, $\cup$; but then we define
new operations $\wedge$, $\vee$ in $W$ by

$$ a \vee b = fa \cup b, \qquad a \wedge b = a \cap Fb. \tag{19} $$

Here the elements $fx$ and $Fx$ of $W$ are to be certain functions of the
element $x \in W$, and these functions are to have the following
properties:

$$ f(fa \cup b) = fa \cup fb; \qquad F(a \cap Fb) = Fa \cap Fb; $$
$$ fa \cup a = a; \qquad a \cap Fa = a. \tag{20} $$

Then the set of elements $W$ is a skew lattice also with respect to the
operations $\wedge$, $\vee$.

Functions with the properties (20) can be set up in many ways by taking
special structures as a basis. If in particular one uses suitable lattices,
one obtains examples of skew lattices from the knowledge of lattices.

A more special class of functions $f$, $F$ satisfies the top row of (20) in
the form

$$ f(a \cup b) = fa \cup fb, \qquad ffa = fa. \tag{21} $$

The corresponding statement holds for $Fx$. If $W$ is a lattice, then with
this more special form (21) of the functions $f$, $F$ the axiom (13) is,
incidentally, satisfied if and only if

$$ Ffa \supseteq a; \qquad fFa \subseteq a. \tag{22} $$

Let us now consider an arbitrary lattice $V$ with elements $a, b, \ldots$,
and form the direct product of $V$ with itself, i.e. a lattice whose
elements are pairs $(a_1, a_2)$ of elements of $V$. From this we take the
sublattice of those elements for which $a_1 \subseteq a_2$. In the lattice
$W$ so described we define, in order to satisfy (21) and the corresponding
relations for $Fx$:



This quite special example of a class of skew lattices also satisfies,
incidentally, the remarkable additional axiom

$$ (a \wedge b) \vee (b \wedge a) = (b \wedge a) \vee (a \wedge b), $$
$$ (a \vee b) \wedge (b \vee a) = (b \vee a) \wedge (a \vee b), \tag{24} $$

which may fittingly be called the axiom of semi-commutative skew lattices,
since in particular it is always satisfied when at least one of the two
operations $\wedge$, $\vee$ is commutative.
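The construction of new operations from a lattice by two auxiliary functions can be tried out on a small case. The sketch below rests on several assumptions that are not taken from the text: the base lattice is the power set of a two-point set, the new operations are read as "a vee b = fa union b" and "a wedge b = a intersect Fb", and, since the defining formula for the pair example is not legible in the source, the functions are taken to be f(a1, a2) = (a1, a1) and F(a1, a2) = (a2, a2). All names (`V`, `W`, `cap`, `cup`, `f`, `F`, `vee`, `wedge`) are hypothetical.

```python
from itertools import combinations, product

base = frozenset({0, 1})
# Base lattice V: all subsets of a two-point set.
V = [frozenset(c) for r in range(3) for c in combinations(sorted(base), r)]
# W: pairs (a1, a2) of elements of V with a1 contained in a2.
W = [(a1, a2) for a1 in V for a2 in V if a1 <= a2]

def cap(a, b):
    return (a[0] & b[0], a[1] & b[1])   # componentwise intersection

def cup(a, b):
    return (a[0] | b[0], a[1] | b[1])   # componentwise union

def f(a):
    return (a[0], a[0])                 # ASSUMED choice, not from the text

def F(a):
    return (a[1], a[1])                 # ASSUMED choice, not from the text

def vee(a, b):
    return cup(f(a), b)                 # assumed reading: a vee b = fa union b

def wedge(a, b):
    return cap(a, F(b))                 # assumed reading: a wedge b = a cap Fb

pairs = list(product(W, repeat=2))
triples = list(product(W, repeat=3))

associative = all(
    wedge(wedge(a, b), c) == wedge(a, wedge(b, c))
    and vee(vee(a, b), c) == vee(a, vee(b, c))
    for a, b, c in triples)

# Basic axiom (11): (a wedge b) vee a = a wedge (b vee a) = a.
absorption_11 = all(vee(wedge(a, b), a) == a == wedge(a, vee(b, a))
                    for a, b in pairs)

# Axiom (13) with the letters rearranged.
axiom_13 = all(vee(wedge(b, a), a) == a == wedge(a, vee(a, b))
               for a, b in pairs)

# Semi-commutativity (24).
semi_commutative_24 = all(
    vee(wedge(a, b), wedge(b, a)) == vee(wedge(b, a), wedge(a, b))
    and wedge(vee(a, b), vee(b, a)) == wedge(vee(b, a), vee(a, b))
    for a, b in pairs)

commutative = all(wedge(a, b) == wedge(b, a) and vee(a, b) == vee(b, a)
                  for a, b in pairs)
```

Under these assumptions the nine-element structure is associative and satisfies (11) and (24), while it is neither commutative nor a model of (13). This agrees with what the text says about the pair example, which is some support for the assumed choice of the two functions, though no proof that it is the author's.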

The skew lattices to be derived from lattices by the $f$, $F$-construction
do have, despite their great variety, a very special property in common:
they satisfy the additional axiom (a sharpening of (15)):

$$ a \wedge b \wedge c = a \wedge c \wedge b, \qquad a \vee b \vee c = b \vee a \vee c. \tag{25} $$

An example of a modular skew lattice which does not satisfy this additional
axiom (25) (but does satisfy (15)) is given by the following operation
table for the four elements 0, $u$, $v$, 1 of the skew lattice $W_4$, in
which $x$ denotes an arbitrary element of $W_4$:

$$ \begin{array}{ll}
0 \wedge x = 0, & x \vee 1 = 1, \\
u \wedge x = u, & x \vee u = u, \\
v \wedge x = v, & x \vee v = v, \\
1 \wedge x = x, & x \vee 0 = x.
\end{array} \tag{26} $$

This skew lattice $W_4$ is of similarly fundamental importance for the
theory of distributive skew lattices as the lattice consisting of only the
two elements 0, 1 is for the theory of distributive lattices. One can, to
be sure, define the notion of distributive skew lattices in many different
ways, so that the definition is made stricter or, on the contrary, more
tolerant. An example of a distributive law for skew lattices is the
following:

$$ c \wedge (b \vee a) = c \wedge [b \vee (c \wedge a)], $$
$$ [(a \vee c) \wedge b] \vee c = (a \wedge b) \vee c. \tag{27} $$


This very tolerant distributive law, which in the commutative case becomes
equivalent to (7), is satisfied by extensive classes of skew lattices, in
particular also by $W_4$. The examples constructed above by the
$f$, $F$-construction with (21) likewise satisfy (27) if a distributive
lattice is taken for $V$.
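The statements made about the four-element skew lattice of table (26) can be checked by brute force. The sketch below encodes the table in two small functions (element names as strings; `wedge`, `vee`, `holds` and `results` are hypothetical names) and tests associativity, the basic axiom (11), axiom (15), the modular axiom (16), the distributive law (27), and the surviving line "a v b v c = b v a v c" of axiom (25).

```python
from itertools import product

ELEMS = ("0", "u", "v", "1")

def wedge(x, y):
    """Table (26): 1 wedge x = x; otherwise the left element wins."""
    return y if x == "1" else x

def vee(x, y):
    """Table (26): x vee 0 = x; otherwise the right element wins."""
    return x if y == "0" else y

def holds(law, arity):
    """True if the law holds for every choice of arguments."""
    return all(law(*args) for args in product(ELEMS, repeat=arity))

results = {
    "associative": holds(
        lambda a, b, c: wedge(wedge(a, b), c) == wedge(a, wedge(b, c))
        and vee(vee(a, b), c) == vee(a, vee(b, c)), 3),
    # (11): (a wedge b) vee a = a wedge (b vee a) = a
    "axiom_11": holds(
        lambda a, b: vee(wedge(a, b), a) == a == wedge(a, vee(b, a)), 2),
    # (15): a wedge b wedge a = a wedge b, a vee b vee a = b vee a
    "axiom_15": holds(
        lambda a, b: wedge(wedge(a, b), a) == wedge(a, b)
        and vee(vee(a, b), a) == vee(b, a), 2),
    # (16): modular axiom for skew lattices
    "modular_16": holds(
        lambda a, b, c: vee(wedge(a, b), wedge(c, vee(a, b)))
        == wedge(vee(wedge(a, b), c), vee(a, b)), 3),
    # (25), join line only: a vee b vee c = b vee a vee c
    "axiom_25_join": holds(
        lambda a, b, c: vee(vee(a, b), c) == vee(vee(b, a), c), 3),
    # (27): tolerant distributive law
    "distributive_27": holds(
        lambda a, b, c: wedge(c, vee(b, a)) == wedge(c, vee(b, wedge(c, a)))
        and vee(wedge(vee(a, c), b), c) == vee(wedge(a, b), c), 3),
    "commutative": holds(
        lambda a, b: wedge(a, b) == wedge(b, a) and vee(a, b) == vee(b, a), 2),
}
```

Every entry of `results` except `axiom_25_join` and `commutative` comes out true, matching the description of this structure as a modular, noncommutative skew lattice that satisfies (15) and (27) but not (25).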

Much stricter distributive laws for skew lattices are obtained, however,
from (8). Because of the commutative law, (8) can evidently be written in
384 different forms, and perhaps all these 384 different ways of writing
(8) have different meanings when they are written with the signs $\wedge$,
$\vee$ instead of $\cap$, $\cup$.

For reasons whose explanation would take up somewhat too much space here,
however, only 6 of these 384 forms can be regarded as presumably
significant. These 6 distributive laws are not all equivalent; whether some
among them may be equivalent is still undecided. The skew lattice $W_4$
defined by (26) satisfies all 6 relations, and moreover 14 further ones,
because in $W_4$ there are some coincidences which are trivial in the
commutative case but by no means so in the noncommutative case. These
relations will be collected below.

First, however, to elucidate the special significance of $W_4$, let the
following be mentioned: As is well known, every distributive lattice can be
obtained as a sublattice of a direct product all of whose factors are
copies of the two-element lattice 0, 1. Analogously, a wide class of skew
lattices, defined by a certain construction procedure, can be obtained by
singling out subdomains of those skew lattices which arise as direct
products of direct factors $W_4$. One may therefore well call these skew
lattices, which thus also possess all the properties of $W_4$ listed below,
the skew lattices that are "distributive" in the strictest sense.

All the findings mentioned, which form only a small selection from more
extensive results, still leave us far from the goal I have in mind, which
could be indicated as the construction of generalized projective geometries
whose elements no longer form modular lattices but modular skew lattices.
Only after that will one be able to judge whether the theory of skew
lattices, apart from appearing to yield an attractive field of mathematical
investigation, could also be useful for physics.

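The following additional axioms I, II are satisfied by $W_4$:

I) The following eight polynomials coincide:

$$ \begin{aligned}
  & (b \wedge c) \vee (a \wedge b) \vee (a \wedge c) = (b \vee a) \wedge (c \vee a) \wedge (b \vee c) \\
= {} & (a \wedge b) \vee (b \wedge c) \vee (a \wedge c) = (b \vee a) \wedge (b \vee c) \wedge (c \vee a) \\
= {} & (b \wedge a) \vee (b \wedge c) \vee (a \wedge c) = (b \vee a) \wedge (b \vee c) \wedge (a \vee c) \\
= {} & (b \wedge c) \vee (b \wedge a) \vee (a \wedge c) = (b \vee a) \wedge (a \vee c) \wedge (b \vee c).
\end{aligned} \tag{28} $$

II) The following four polynomials coincide:

$$ \begin{aligned}
  & (a \wedge b) \vee (c \wedge b) \vee (a \wedge c) = (c \vee a) \wedge (b \vee c) \wedge (b \vee a) \\
= {} & (c \wedge b) \vee (a \wedge b) \vee (a \wedge c) = (c \vee a) \wedge (b \vee a) \wedge (b \vee c).
\end{aligned} \tag{29} $$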
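The two families of polynomial identities I and II can be verified for the four-element skew lattice by evaluating every polynomial on all 64 triples. The sketch below repeats the encoding of table (26) so that it is self-contained; `meets`, `joins`, `family_I` and `family_II` are hypothetical helper names, and each family is collected into a Python set, which has exactly one element when all its polynomials agree.

```python
from itertools import product

ELEMS = ("0", "u", "v", "1")

def wedge(x, y):
    """Table (26): 1 wedge x = x; otherwise the left element wins."""
    return y if x == "1" else x

def vee(x, y):
    """Table (26): x vee 0 = x; otherwise the right element wins."""
    return x if y == "0" else y

def meets(p, q, r):
    """(p1 wedge p2) vee (q1 wedge q2) vee (r1 wedge r2)."""
    return vee(vee(wedge(*p), wedge(*q)), wedge(*r))

def joins(p, q, r):
    """(p1 vee p2) wedge (q1 vee q2) wedge (r1 vee r2)."""
    return wedge(wedge(vee(*p), vee(*q)), vee(*r))

def family_I(a, b, c):
    """Values of the eight polynomials of axiom I."""
    return {
        meets((b, c), (a, b), (a, c)), joins((b, a), (c, a), (b, c)),
        meets((a, b), (b, c), (a, c)), joins((b, a), (b, c), (c, a)),
        meets((b, a), (b, c), (a, c)), joins((b, a), (b, c), (a, c)),
        meets((b, c), (b, a), (a, c)), joins((b, a), (a, c), (b, c)),
    }

def family_II(a, b, c):
    """Values of the four polynomials of axiom II."""
    return {
        meets((a, b), (c, b), (a, c)), joins((c, a), (b, c), (b, a)),
        meets((c, b), (a, b), (a, c)), joins((c, a), (b, a), (b, c)),
    }

triples = list(product(ELEMS, repeat=3))
axiom_I = all(len(family_I(*t)) == 1 for t in triples)
axiom_II = all(len(family_II(*t)) == 1 for t in triples)
# The common value of family I need not equal that of family II.
families_differ = any(family_I(*t) != family_II(*t) for t in triples)
```

Both `axiom_I` and `axiom_II` hold, while `families_differ` is true: for a = 0, b = u, c = v the first family evaluates to u and the second to v, so the two axioms are genuinely different statements in the noncommutative case.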


Bibliography

[1] JORDAN, P., Über eine nicht-desarguessche ebene projektive Geometrie.
    Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg,
    vol. 16 (1949), pp. 74-76.
[2] JORDAN, P., Zur Theorie der Cayley-Größen. Akademie der Wissenschaften
    und der Literatur. Abhandlungen der Mathematisch-Naturwissenschaftlichen
    Klasse. Series 3 (1949), pp.
[3] JORDAN, P., Die Theorie der Schrägverbände. Abhandlungen aus dem
    Mathematischen Seminar der Universität Hamburg, vol. 21 (1957),
    pp. 127-138.
[4] JORDAN, P., J. v. NEUMANN and E. WIGNER, On an algebraic generalization
    of the quantum mechanical formalism. Annals of Mathematics, vol. 35
    (1934), pp. 29-64.
[5] KOECHER, M., Analysis in reellen Jordan-Algebren. Nachrichten der
    Akademie der Wissenschaften in Göttingen, Series IIa, Nr. 4 (1958),
    pp. 67-74.


Symposium  on  the  Axiomatic  Method 


LOGICAL  STRUCTURE  OF  PHYSICAL  THEORIES 

PAULETTE FÉVRIER

Henri Poincaré Institute, Paris, France

At the present time, the methodology of Theoretical Physics is not yet
well determined and clear. Conceptions of it vary from one physicist to
another, and, for some of them, the axiomatisation of the theories is only
a part of the development of Theoretical Physics.

Adequacy is the fundamental notion from the point of view of theoretical
physics. A theory is adequate in a certain experimental domain if the
predictions provided by this theory on the basis of given experimental
data taken within this domain agree with experiment. This reference to
experiment introduces notions that do not arise for mathematical
theories.

Let  us  consider  a  part  of  a  physical  theory  which  is  axiomatised  in  a 
suitable  way.  Let  us  leave  aside  the  physical  meaning  of  the  terms  used 
in  this  theory.  Then  we  get  a  certain  mathematical  theory.  This  theory 
possesses  a  structure  (in  Bourbaki's  meaning) ;  we  shall  call  it  the  formal 
structure  of  the  part  of  the  physical  theory  under  consideration. 

If we have been able to axiomatise the whole theory, it receives a
formal structure in the meaning above. But one should not forget that,
in a physical theory, the terms must have a physical meaning, which is
nothing else but an intuitive meaning. If this meaning is set aside,
only a formal model remains, which would lose its interest for
the physicist. That is why the meaning must always be taken into account
together with the structure.

Many authors have worked out more or less precise axiomatisations of wave
mechanics or quantum theories, every one of which has its advantages
and drawbacks, but an axiomatisation which would be at the same time
completely satisfactory and adequate seems not yet to have been
proposed. I mean that, independently of the difficulties regarding the
formal expression, which we shall leave aside as if they were resolved,
an axiomatisation must not allow any example of inadequacy, i.e.
a physical case which should be described by the axiomatised theory
and yet escapes this description. One could give some examples of
particular cases of inadequacy that have been mentioned about certain
attempts at axiomatisation of wave mechanics:

a) certain axiomatic systems show a lack of adequacy because there exist
potentials V for which the Hamiltonian operator of the Schrödinger equation
no longer possesses the general properties which are required of
the operators associated with physical observables.

b) on the other hand it may be useful to examine whether some other
axiomatic systems have to be modified or improved because of
spectra which have points of accumulation (for instance, see Colmez'
paper [3]).

In spite of these difficulties, it is possible to realise axiomatisations of
some parts of a physical theory, but the main problem from the point of
view of theoretical physics is to arrive at a better theory rather than at
a perfect axiomatisation of a given one. It is a fact that a theory which
interests a physicist is never completely built up and presents some
defects, whereas a well-shaped and in some sense completed theory no
longer attracts him, probably because, when this stage is reached, he
is already running towards some other new growing theory. The interest of
the physicist lies in the processes by which new theories are formed,
whereas the logician and the mathematician care for formal achievement.

Every physical theory holds in a limited experimental domain; the
problem which is always before the physicist is to find new conceptions
leading him to a new theory adequate to the experimental data
unaccounted for by the preceding ones.

Hence, if physico-logical studies can be useful for the physicist, they
will be useful provided they are applied to a theory not completely achieved
but still in the course of its development. Once the theory is fully
built up, its inadequacies appear, the boundaries of its experimental
domain are known, and the physicist turns towards the building
of a new and better theory. That is the reason why, from the special
standpoint of the physicist, it may be more useful to elaborate
physico-logical considerations in order to help his attempts at new
theories, than to try to provide, for an achieved theory, a satisfactory
axiomatisation in the strictest sense of the term.

However,  the  properly  axiomatic  enquiries  about  a  given  physical 
theory  are  necessary,  not  only  from  a  formal  point  of  view,  but  also  from 
the  point  of  view  of  the  theory  of  knowledge. 



Whatever point of view we adopt, it seems to me that the first difficulty
which arises is to determine exactly what we mean by a physical theory
and by a satisfactory axiomatisation of a physical theory. What does the
physicist intend when he tries to elaborate a physical theory? This question
appears very important because, if we look at the considerations I
mentioned before, we find that they are not all related to the same
meaning of the idea of a physical theory. It seems to me that such a
question can be answered in three quite different ways:

1)  the  aim  of  the  physicist  when  he  makes  a  new  theory  can  be  only  to 
find  new  results,  that  is  to  build  up,  at  any  rate  and  by  any  means,  a 
theory  which  enables  him  to  predict  some  new  experimental  datum ; 

2) the aim of a physical theory can be to provide what we call an
explanation of physical reality, that is a formal construction, adequate to
the experimental data, which connects them in a satisfactory rational
way. At present, what we should call a "rational way" of building
explanations means a deductive way, according to the axiomatic method. This
conception of a physical theory does not exclude the search for adequate
predictions, but puts the formal requirements in first place;

3) the aim of a physical theory can also be a description of physical
reality in the sense of a connexion between the set of experimental data
and some principles and notions which are intuitively considered as fundamental
in something like a "Weltanschauung". These primitive elements
must lead deductively to statements verified by experiment, and they aim
also to supply adequate predictions, but they are chosen first with
respect to their fundamental role in the description. They arise from
various previous considerations which form, according to Destouches'
expression, an "inductive synthesis" [5a, p. 86; 5b, vol. I, p. 114].

Several axiomatic systems can be set up with respect to the same
experimental domain; according to the second meaning of a physical
theory, the best theories are the most suitable among these various axiomatic
systems with respect to the requirements of the axiomatic method.
According to the third meaning of a physical theory, the best theory is
not necessarily the best system from an axiomatic point of view, but the one,
among the various systems, which depends on the most fundamental
notions in a heuristic sense.

If we come back to the first of the three preceding meanings, we see
that  it  has  to  be  considered  as  a  minimum  requirement  with  respect  to 
the  question:  what  is  a  physical  theory?  Indeed,  it  founds  a  physical 


LOGICAL  STRUCTURE  OF  PHYSICAL  THEORIES  379 

theory on the single purpose of calculating predictions for future experimental
data, starting from initial experimental data.

Taking this minimum requirement as a primitive assumption, one can
build up the most general physical theory, that is, a general theory of
predictions [5b, vol. II, pp. 505-654, vol. III, pp. 705-742; 5c]. A summary
of this theory has been given by Destouches in his lecture. It aims to be a
frame into which any physical theory will fit; I mean, it aims to point
out what, in any physical theory, is involved in the particular initial purpose
of calculating predictions.

This  general  theory  of  predictions  is,  by  definition,  a  physical  theory 
in  the  first  sense  of  the  term,  but,  according  to  the  second  one,  it  can  be 
axiomatised.  According  to  the  third  meaning,  the  general  theory  of 
predictions  points  to  the  purpose  of  calculating  predictions  as  one  of  the 
most  fundamental  notions  in  a  physical  theory,  if  we  consider  a  physical 
theory as an attempt made by a physicist in order to provide a "Weltanschauung"
adequate, not only to the experimental data already
known,  but  also  to  future  experimental  data.  However,  a  physicist  who 
assumes  the  third  meaning  of  a  physical  theory  requires  more  than  the 
single  idea  of  adequate  prediction  to  set  up  a  particular  physical  theory. 

The  physico-logical  studies  do  not  restrict  themselves  to  one  of  these 
three  points  of  view.  That  is  why  they  are  not  formal  on  the  whole,  though 
many  parts  of  them  can  be  formalised.  They  do  not  pretend  to  be  more 
than  a  help,  as  well  for  the  approaches  of  the  physicist  as  for  those  of  the 
metamathematician or of the philosopher.

The  way  in  which  physico-logical  studies  may  contribute  to  elaborate 
physical  theories  is  the  following:  in  order  to  satisfy  a  certain  physical 
condition  by  means  of  a  theory  that  we  try  to  elaborate,  physico-logical 
considerations  can  supply  the  theoretical  requirements  to  be  fulfilled  by 
this  theory.  When  the  physical  theory  must  satisfy  several  physical 
conditions, some of them being in contradiction, physico-logical considerations
permit us to reduce the contradictions, and to establish which
elements  of  the  theory  remain  to  be  determined,  in  order  to  achieve  it. 

I  shall  try  now  to  give  some  examples  of  physico-logical  enquiries,  in 
the  sense  explained  above. 

The first task in setting up a physical theory is to elaborate schemes of the
concrete physical operations, such as making a measurement, reading the
result of a measurement, etc. For example, from a schematic point of
view, a measurement which is supposed but not effectively realised can
be assimilated to an effective measurement; we can also assimilate a result
of measurement to a prediction for the very instant when the measurement
is effected.

Further,  we  have  to  represent  such  schemes  by  suitable  mathematical 
entities, such as sets, elements, etc. With this in mind, let us consider more thoroughly
the initial assumption on which the general theory of predictions is
based.  We  have  to  make  precise  what  we  mean  by  an  "initial  datum"  and 
by  a  "prediction  statement",  and,  more  generally,  to  determine  what  are 
the  various  kinds  of  statements  which  have  to  be  taken  into  account  in  a 
physical  theory. 

Measurements are classified into types called observables, which are determined
by various experimental processes. An observable is represented
by an element of a set called the set of observables. An experimental datum
is read by the observer on the dial or the scale of a measuring apparatus
(for example, it is the position of the spot in a galvanometer). This
position is not infinitely precise. In a schematic way, we can represent it
by an interval E with rational ends on a straight line or a circle. Its
extent is estimated by the observer on the basis of all that he knows
about the precision of his apparatus. For instance, the precision is obtained
by repeating a particular measurement a rather large number of
times (we know that its results might be always the same), and by
calculating the standard deviation of the various numbers obtained in
this way. The weakest assumption we can make about it is to assume that
the result of the measurement is this very interval E and not a special
point, determined but unknown, inside that interval. In the case of macroscopic
theories we can admit that there is in E one point which is the
real result of the measurement, but in microphysics such an assumption
cannot be made.

Fig. 1


Instead of an apparatus with only one dial we may have an apparatus
with several dials, which can be linear or circular. For every reading of
a dial we shall have an interval on a straight line or on a circle. In order
to get the complete result of the measurement, which is an n-interval
with rational ends, all the dials must be taken into account at the same
time, and then the result is represented in the cartesian product of the
curves on which the results taken from every dial have been represented.
This cartesian product makes up a space called the "observational space of
the observable A", denoted by $(R_A)$, and such a space is associated with
each observable [1] (see figure 1).
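
The representation of a reading described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every dial is treated as linear (circular dials are ignored), and the names `Interval`, `n_interval` and `contains` are hypothetical, not taken from the text.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Interval:
    """One dial reading: a closed interval [lo, hi] with rational ends."""
    lo: Fraction
    hi: Fraction

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower end must not exceed upper end")


def n_interval(*readings):
    """Combine one reading per dial into an n-interval.

    The tuple stands for a box in the cartesian product of the dials,
    i.e. a set in the observational space of the observable.
    """
    return tuple(readings)


def contains(box, point):
    """True if the point (one coordinate per dial) lies inside the box."""
    return len(box) == len(point) and all(
        iv.lo <= x <= iv.hi for iv, x in zip(box, point)
    )


# Hypothetical apparatus with two linear dials.
box = n_interval(
    Interval(Fraction(1, 2), Fraction(3, 4)),
    Interval(Fraction(0), Fraction(1, 10)),
)
inside = contains(box, (Fraction(5, 8), Fraction(1, 20)))
outside = contains(box, (Fraction(1), Fraction(0)))
```

In the microphysical reading of the text, the box itself is the result of the measurement; `contains` is only meaningful under the macroscopic assumption that one point of the box is the real result.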

A sentence which states an experimental datum is an empirical sentence
such as

$$\text{at } t_0,\quad \operatorname{Re\,Mes} A \subset E_A$$

where $t_0$ is an instant of the observer's clock, Re Mes $A$ is a certain set in
$(R_A)$, and $E_A$ a specified set in $(R_A)$.

Although no specific theory is assumed when we begin to set up a
general theory of predictions, we cannot give a physical meaning
to an empirical sentence without admitting that such a meaning is provided
to this empirical sentence by a certain theory, the theory by means
of which the experiment has been motivated.

In order to take into account the case of a theory with quantization,
we have to introduce the set $\mathcal{A}_A$ of the possible values of an observable
A, which is a set in the observational space $(R_A)$. In the particular case of
no quantization, as in macroscopic physics,

$$\mathcal{A}_A = (R_A).$$

Hence, from an empirical sentence, we obtain a so-called experimental
sentence by intersection of $E_A$ and $\mathcal{A}_A$, that is

$$\text{at } t_0,\quad \operatorname{Re\,Mes} A \subset \mathcal{E}_A \qquad (\text{where } \mathcal{E}_A = E_A \cap \mathcal{A}_A).$$

In the case of no quantization,

$$\mathcal{E}_A = E_A.$$

As I have already said, we may in theoretical physics take into
consideration not only statements expressing facts effectively realized,
but  also  statements  concerning  supposed  facts.  From  a  schematic  point 
of  view  we  can  look  at  these  supposed  data  in  the  same  way  as  the  real 
data. 


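Then, from an experimental sentence or a pair of experimental sentences
about one observable A, we can derive by logical means new sentences
which will also be experimental sentences. In this way, we obtain
a calculus for the experimental sentences concerning only one observable A.
We can then look at this calculus as a formal system. For example, we
can denote by

$p_1$ the sentence: at $t_0$, Re Mes $A \subset \mathcal{E}_{p_1}$
$p_2$ the sentence: at $t_0$, Re Mes $A \subset \mathcal{E}_{p_2}$

and define

$$p_1 \,\&\, p_2 =_d \operatorname{Re\,Mes} A \subset (\mathcal{E}_{p_1} \cap \mathcal{E}_{p_2}) \qquad \text{(logical product)}$$

$$p_1 \vee p_2 =_d \operatorname{Re\,Mes} A \subset (\mathcal{E}_{p_1} \cup \mathcal{E}_{p_2}) \qquad \text{(weak logical sum, or superposition; see figure 2)}$$

$$\neg p =_d \operatorname{Re\,Mes} A \subset (\mathcal{A}_A - \mathcal{E}_p) \qquad \text{(negation)}$$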
From these definitions arise the rules of the calculus of experimental
sentences for one observable. When it is formalized, this system is a
language $L_{1,A}$ of the experimental sentences. It is obviously a Boolean
algebra [7a].

Fig. 2

Fig. 3
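
The Boolean character of this calculus can be checked on a toy model. The sketch below assumes a finite set of possible values, which is a simplification made only for illustration; `SPECTRUM` and the function names are hypothetical. Each experimental sentence is identified with the set of values it allows.

```python
# Hypothetical finite set of possible values of one observable A.
SPECTRUM = frozenset(range(6))


def experimental_sentence(empirical_set):
    """Intersect an empirical set E with the set of possible values."""
    return frozenset(empirical_set) & SPECTRUM


def logical_product(p, q):
    """p & q: the result lies in both sets."""
    return p & q


def weak_sum(p, q):
    """Weak logical sum for one observable: the result lies in the union."""
    return p | q


def negation(p):
    """Not p: the result lies in the complement within the spectrum."""
    return SPECTRUM - p


p = experimental_sentence({1, 2, 3})
q = experimental_sentence({2, 3, 4})
r = experimental_sentence({0, 3, 9})  # 9 is not a possible value and is dropped

# Distributivity and De Morgan's law, two of the Boolean identities.
distributive = logical_product(p, weak_sum(q, r)) == weak_sum(
    logical_product(p, q), logical_product(p, r)
)
de_morgan = negation(logical_product(p, q)) == weak_sum(negation(p), negation(q))
```

Both checks hold for any choice of subsets, since the operations are ordinary set intersection, union and complement.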

Now, another case must be examined. Because of a lack of precision
when we read the result of the measurement, it is possible that we cannot
state whether it is the n-interval $E_1$ or another n-interval $E_2$ which is the result
really indicated by the dial, and we can state only that the result is: $E_1$ or
$E_2$. In that case, the datum is no longer expressed by an experimental
sentence, since the result is no longer one set, but one set or another set.
However, we know something about this result, which can be expressed
by a kind of proposition of a more general type than the experimental
sentence defined above, and which we shall call an experimental propositional,
according to a suggestion of Prof. Beth (see figure 3).


The propositional expressing that the result of the measurement is
either $E_1$ or $E_2$ will be denoted by $p_1 \mathbin{\mathrm{v}} p_2$, and every propositional can be
obtained from experimental sentences by means of that operation v,
called mixture or strong logical sum.

Experimental sentences may be considered as particular propositionals.
When we formalize the system obtained in that way we have a
language $L_{2,A}$ of the propositionals for the observable A.
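
One possible encoding of the difference between a sentence and a propositional, offered only as a sketch: a sentence is a set of values, and a propositional is a set of alternative sentences. The function names are hypothetical, and the encoding is an assumption made for illustration rather than the formalization of [7a].

```python
def sentence(values):
    """An experimental sentence: one set of possible results."""
    return frozenset(values)


def as_propositional(p):
    """A sentence regarded as a particular propositional (one alternative)."""
    return frozenset({p})


def mixture(P, Q):
    """Strong logical sum: the result is one alternative or another."""
    return P | Q


def weak_sum(p, q):
    """Weak logical sum of two sentences for one observable: a single set."""
    return p | q


e1 = sentence({1, 2})
e2 = sentence({4, 5})

mixed = mixture(as_propositional(e1), as_propositional(e2))
superposed = as_propositional(weak_sum(e1, e2))

# The mixture keeps the two alternatives apart; the weak sum merges them.
alternatives_kept = len(mixed) == 2
differs_from_weak_sum = mixed != superposed
```

With this encoding, mixing a sentence with itself changes nothing, and the order of the two alternatives does not matter.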

Now we can take experimental sentences concerning several observables
A, B, .... If these observables can be observed at the same time, we
form a compound observable that I shall denote A § B, and we can
reduce this to the above case of one observable; § is a binary operation which
is applied to certain pairs of the set of observables. If there is a single
pair of observables which cannot be measured together (as in microphysics),
we have to bring in the calculation of predictions, in order to
be able to describe that special case.

As Destouches explained in his lecture, the general theory of predictions
enables us to point out a correspondence between every experimental
sentence $p$ and a subspace $\mathcal{M}_p$ passing through the origin of a vector space
$(\mathcal{Y})$. If that space $(\mathcal{Y})$ had a finite number of dimensions, then all
observables would have a finite spectrum; hence, adequacy requires that
the space $(\mathcal{Y})$ be infinite dimensional. When an operation applied to experimental
sentences yields an experimental sentence $p$, a subspace $\mathcal{M}_p$
corresponds to this sentence and the operation induces in the space $(\mathcal{Y})$
an operation on the subspaces.

Then  the  properties  of  the  sentential  calculus  on  the  experimental 
sentences  will  be  those  of  the  calculus  on  the  associated  subspaces.  The 
study  of  these  properties  enables  us  to  point  out  the  characteristics  of  the 
theory elaborated in order to calculate predictions for future measurements.
Two cases are to be considered:

1)  in  the  case  of  one  observable,  or  of  observables  each  pair  of  which 
can  be  measured  at  the  same  time,  the  set  of  the  associated  subspaces  is  a 
Boolean  algebra.  Hence,  in  a  physical  theory  where  all  observables  can  be 
measured  at  the  same  time,  the  experimental  sentences  follow  the  rules 
of the classical sentential calculus;

2) in the second case, there is at least one pair of observables which,
by right, cannot be measured at the same time. "By right" means:
according to the theory. In that case, the study of the operations on the
subspaces associated with the experimental sentences points out the
characteristics of the corresponding logical operations, in such a way that
the logic which is then adequate is no longer the classical sentential
calculus, but a special logic LCS with the following rules for the logical
negation, product and sum [1; 7a, pp. 91-216]:

NEGATION: To the negation $\neg p$ of $p$ corresponds the sentence asserting
that the result of the measurement is not in $\mathcal{E}_A$; hence the associated
subspace is the orthogonal complementary subspace.

LOGICAL PRODUCT: Because of the homomorphism between the
sentential calculus and the calculus on the subspaces $\mathcal{M}$, with a sentence
$r$ which is the logical product of $p$ and $q$,

$$p \,\&\, q = r,$$

is associated the intersection of the corresponding subspaces

$$\mathcal{M}_r = \mathcal{M}_p \cap \mathcal{M}_q.$$

But, in the case where $p$ and $q$ relate to observables which, by right,
cannot be measured at the same time, the corresponding subspace $\mathcal{M}_{p \& q}$
is reduced to the point 0. Hence the sentences of type $p \,\&\, q$ are excluded,
that is, certain pairs of sentences are not composable.

LOGICAL SUM: The conjunction "or" joining two experimental sentences
can take two different meanings:

Superposition: We can define a weak logical sum $p \vee q$ with the following
meaning: a measurement of A has been effected, or a measurement of B,
or a measurement of the compound observable A & B, with such an
imprecision that we can only assert that the result of the measurement
belongs to

$$(\mathcal{E}_A \times \mathcal{A}_B) \cup (\mathcal{A}_A \times \mathcal{E}_B),$$

$\mathcal{A}_A$ and $\mathcal{A}_B$ being the spectra of A and B. To this operation corresponds
the sum of subspaces

$$\mathcal{M}_{p \vee q} = \mathcal{M}_p \oplus \mathcal{M}_q,$$

$\mathcal{M}_{p \vee q}$ being the subspace spanned by $\mathcal{M}_p$ and $\mathcal{M}_q$ in $(\mathcal{Y})$.

In  wave  mechanics,  this  operation  describes  the  notion  of  superposition. 

In this way, the logical operations &, $\vee$, with $\neg$, have the same properties
as the operations $\cap$, $\oplus$ and orthocomplementation of the subspaces
passing through 0 of the space $(\mathcal{Y})$. Thus the sentential calculus appears
as isomorphic to an algebra of infinite dimensional ortho-complemented
projective geometry. It is an ortho-complemented lattice, non-modular in the
general case.
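
Why such a lattice of subspaces cannot be a Boolean algebra can be seen already in a plane. The sketch below is a finite-dimensional illustration only (the text requires an infinite dimensional space), and the helper names are hypothetical. Subspaces are given by spanning vectors; dimensions are computed by Gaussian elimination over the rationals, and the dimension of an intersection follows from dim(U ∩ V) = dim U + dim V - dim(U + V).

```python
from fractions import Fraction


def rank(vectors):
    """Dimension of the span of the given vectors (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def dim_sum(U, V):
    """Dimension of the subspace spanned by U and V together."""
    return rank(U + V)


def dim_meet(U, V):
    """Dimension of the intersection of span(U) and span(V)."""
    return rank(U) + rank(V) - rank(U + V)


# Three distinct lines through the origin of a plane.
p = [(1, 0)]
q = [(0, 1)]
r = [(1, 1)]

# p meet (q + r): q and r span the whole plane, so the meet is p itself.
lhs_dim = dim_meet(p, q + r)

# (p meet q) + (p meet r): both meets are the zero subspace, and the
# dimension of a sum is at most the sum of the dimensions.
rhs_dim_bound = dim_meet(p, q) + dim_meet(p, r)

distributive_law_fails = lhs_dim > rhs_dim_bound
```

Here the left-hand side has dimension 1 and the right-hand side dimension 0, so the distributive law of the classical sentential calculus fails for intersection and sum of subspaces.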

Mixture: On the other hand, the mixture of experimental sentences
leads us to a calculus of experimental propositionals, in the following way:

With the sentences $p$, $q$, we can associate a sentence $p \mathbin{\mathrm{v}} q$ called strong
logical sum. It means that either the observable A has been measured,
and the result has been found in $\mathcal{E}_A$, or the observable B has been measured,
and the result has been found in $\mathcal{E}_B$, or both observables have been
measured if they are composable, and the result has been found in $\mathcal{E}_A \times \mathcal{E}_B$,
but we do not know which one of these three cases has been realized. To
this strong logical sum v corresponds the union of the associated subspaces,
which is not a subspace:

$$\mathcal{M}_{p \mathbin{\mathrm{v}} q} = \mathcal{M}_p \cup \mathcal{M}_q.$$

Hence we are led to distinguish from the sentential calculus a calculus
of propositionals, a propositional being a strong logical sum of sentences.
This is the language $L_4$ of the experimental propositionals, which is a
distributive lattice under its two operations.

Now, we may have to express that, for instance, if $p$ is asserted, then $q$
has also to be asserted. That introduces a relation $\to$ in a language $L_5$.
To the relation $\to$ corresponds the relation of inclusion for the corresponding
subspaces. In the same way, we shall have a language $L_6$ for the
propositionals; and in that case, to the symbol $\to$ will correspond the
inclusion of the sets formed by union of subspaces.

Finally, we have a language $L_7$ which is the language of the physical
theory under consideration. In order to understand what the language
$L_7$ is, let us take the case of Newtonian mechanics: $L_7$ denotes what would
be the formalization of Newtonian mechanics when the initial conditions
are left free. (Here the experimental sentences state that the initial
conditions belong to a certain set of the phase-space.) In any physical
theory, $L_7$ corresponds to the formalization of its deductive part.

We see that the general theory of predictions enables us to make
apparent the logical structure of physical theories, by means of a correspondence
between certain subspaces and the various sets of sentences
used in these theories. Moreover, the general theory of predictions
thus shows that one must distinguish between two kinds of physical theories,
as Destouches says in his lecture. The calculus of experimental sentences
of the quantum theories is not a Boolean algebra but an algebra of projective
geometry, and that is, in my opinion, the most important characteristic
of the structure of this kind of theory.

I should like now to give another example of physico-logical considerations
about the comparison between these two kinds of theories.

A historical example of this duality of physical theories is given now
by the opposition between the so-called classical probabilistic interpretation
of wave mechanics, and the causal and deterministic interpretation
proposed, some years ago, by David Bohm [2], Louis de Broglie
[4] and several other physicists.

I  think  that,  from  a  physico-logical  point  of  view,  we  do  not  have  to 
decide in favour of one or of the other kind of theory, because it is possible,
as I shall try to show now, to find means of translating one into the
other  and  conversely  [7b;  7c]. 

First, one can prove that it is possible to pass from a quantum phenomenalist
theory to a causal theory, provided a modification be made of the
notion of the physical system described.

Let S be a source of particles, for example an electron gun; in wave
mechanics in its usual meaning, the system S that the theory plans to
describe is one electron; in a causal theory, the observed system is
determined only if we decide which experimental apparatus we put after
the gun; for instance we can put a screen with one hole of a given diameter;
this measuring apparatus $a_A$ allows us to know, with a precision
determined by the diameter of the hole, the value of the observable A
which is the position of the particle; the system in observation is then
$S/a_A$. We might put, instead of a screen with a single hole, a screen with
two holes (Young's holes). Then we should have a quite different system,
$S/a_B$, which cannot be realised at the same time as $S/a_A$. Thus, in a causal
description, in these two cases we have two different physical systems;
indeed, the boundary conditions are different, the quantum potentials
are different. In both cases the parameters which can be reached by experiment
are not the same. The initial conditions, in both descriptions,
are the same: they are determined by the characteristics of the gun.

In the case of the usual quantum theory, the studied system is, as we
have seen, the particle S; what we know about the gun determines the
initial wave; if we use the compound system-apparatus $S/a_A$, we measure
the observable A on S; if we use the compound system-apparatus $S/a_B$,
we take a measurement of the observable B on S. Thus, the observed
system is always the same system S.


The rules of wave mechanics supply predictions by means of probabilities
concerning the results which will be obtained. As we have seen
before, the set of the experimental sentences is a non-distributive, and
generally non-modular, lattice. But it admits Boolean sub-lattices $B_A$ for
every observable A. And such a sub-lattice is identical to the Boolean sub-lattice
of the experimental sentences concerning the system $S/a_A$ in the causal
description. This identity is what makes possible the duality of description,
but the correspondence between the sentences of the two types of theories
cannot be extended further than the case of the experimental sentences for a
complete observable in the quantum description. Indeed, the lattice of
the experimental sentences of the probabilistic description is not distributive,
while the algebra of the experimental sentences in the causal
description is distributive. The correspondence can be extended only by
means of probabilities: to an experimental sentence expressing a maximum
observation on the system S (hence determining a single initial function)
corresponds a law of probability for the observable A, hence a valuation
of the Boolean algebra $B_A$, and a law of distribution for the system $S/a_A$.
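
What it means for a law of probability to give a valuation of the Boolean algebra $B_A$ can be sketched as follows. The example assumes a finite spectrum and real rational amplitudes, and it uses the usual squared-amplitude rule of wave mechanics, which the text does not spell out; all names and numbers are hypothetical.

```python
from fractions import Fraction


def probability_law(amplitudes):
    """Turn amplitudes of the initial function into a law of probability.

    `amplitudes` maps each possible value of A to a real rational
    amplitude; probabilities are the normalised squares.
    """
    weights = {value: Fraction(a) ** 2 for value, a in amplitudes.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("the initial function must not vanish")
    return {value: w / total for value, w in weights.items()}


def valuation(law, event):
    """Value assigned to the sentence 'the result lies in `event`'."""
    return sum((law[v] for v in event if v in law), Fraction(0))


# Hypothetical observable with three possible values.
law = probability_law({"a1": 1, "a2": 2, "a3": 2})

e1 = {"a1", "a2"}
e2 = {"a2", "a3"}

# Additivity on the Boolean algebra of subsets of the spectrum.
additive = valuation(law, e1 | e2) == (
    valuation(law, e1) + valuation(law, e2) - valuation(law, e1 & e2)
)
```

The same numbers can be read in the causal description as a distribution law over the values found for the system $S/a_A$, which is the sense in which the correspondence is extended by means of probabilities.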

We see then that position plays a special role in a causal theory;
indeed, A may be the position; if A is not the position, we shall observe the
system $(S + a_A)/a_B$ where B is an observable reducible to a measurement
of position; thus, by changing the physical system in observation, one can
always reduce to position in a causal theory.

In  this  way,  we  see  that  the  two  kinds  of  theories  are  equivalent  with 
respect  to  a  certain  experimental  domain,  or  set  of  facts.  We  can  pass  from 
one to the other provided a modification is made of our conception of the
physical  system  taken  under  consideration. 

Conversely, it can be shown that a translation is possible from a causal
theory to a probabilistic one. Let us assume a causal theory which supplies
descriptions for the systems $S/a_A$, $S/a_B$, etc. Is it possible to build up a
probabilistic theory supplying the same predictions as this causal theory
but which would not contain parameters that we cannot reach by experiment?
In such a theory, we have to take into account only the sentences
corresponding to parameters which can be submitted to experiment.
Hence the construction of the theory must be effected in two steps:

1) To select, among the initial sentences in the causal theory, and for every
system $S/a_A$, $S/a_B$, etc. (S remaining the same), the experimental sentences
and their consequences. That can be done by a logical process
using the modality "experimentable". This process enables us to show that
the experimental sentences of the causal theory form a subset of the
lattice of the experimental sentences of the probabilistic theory.

2) Then we have to join in a single description concerning S the partial
descriptions corresponding to every system $S/a_A$, etc. That can be
realised; and then it is sufficient to identify the expressions of the probabilities
computed according to the causal theory, and those computed
according to the general theory of predictions. Since every theory supplying
predictions can be put in the frame of the general theory of predictions,
such an identification is possible.

Thus, if one has been able to set up an adequate causal theory in microphysics,
one can, by eliminating the elements of the theory which cannot be
experimented upon, build up a probabilistic theory, equivalent to the given
theory in the following way: it supplies the same predictions about future
measurements. Such a theory has the same structure as the usual quantum
theory, and is essentially indeterministic. It does not contain non-experimentable
observables.

From  these  two  processes  of  translating  one  kind  of  theory  into  the 
other,  we  can  see  that  they  are  not  different  with  respect  to  experimental 
data  or  adequacy.  Their  difference,  in  fact,  concerns  methodological 
assumptions. If one prefers a positivistic approach to elaborate physical
theories,  then  one  cannot  admit  in  a  theory  physical  entities  which  cannot  be 
experimented  upon,  but  the  price  of  this  is  indeterminism.  On  the  other  hand, 
if  one  cannot  accept  indeterminism,  one  has  to  assume  that  certain  physical 
entities  escape  experimentation. 


Bibliography 

[1] BIRKHOFF, G. and VON NEUMANN, J., The logic of quantum mechanics. Annals of Mathematics, vol. 37 (1936), pp. 823-843.
[2] BOHM, D., Suggested interpretation of the quantum theory in terms of "hidden" variables. Physical Review, vol. 85 (1952), pp. 166-193; vol. 87 (1952), p. 389; vol. 89 (1953), p. 458.
[3] COLMEZ, J., Définition de l'opérateur H de Schrödinger pour l'atome d'hydrogène. Annales scientifiques de l'École Normale Supérieure, 3ème Série, vol. 72 (1955), pp. 111-149.
[4] DE BROGLIE, Louis, a) Sur la possibilité d'une interprétation causale et objective de la mécanique ondulatoire. Comptes-rendus Acad. Sciences Paris, vol. 234 (1952), p. 265.
    b) La physique quantique restera-t-elle indéterministe? Paris, 1953, VII + 113 pp.
[5] DESTOUCHES, J. L., a) Essai sur la forme générale des théories physiques. Thèse principale pour le Doctorat ès Lettres, Paris, 1938; Monographies mathématiques de l'Université de Cluj (Roumanie), fasc. VII (1938).
    b) Principes fondamentaux de Physique théorique. Paris, 1942, IV + 905 pp.
    c) Corpuscules et systèmes de corpuscules. Notions fondamentales. Paris, 1941, 342 pp.
[7] FEVRIER, P., a) La structure des théories physiques. Paris, 1951, XII + 424 pp.
    b) Sur l'élimination des paramètres cachés dans une théorie physique. Journal de Physique et le Radium, vol. 14 (1953), p. 640.
    c) L'interprétation physique de la mécanique ondulatoire et des théories quantiques. Paris, 1956, VIII + 216 pp.


Symposium  on  the  Axiomatic  Method 


PHYSICO-LOGICAL  PROBLEMS 

J.  L.  DESTOUCHES 
Henri  Poincare  Institute,  Paris,  France 

1.  Introduction.  I  call  physico-logical  problems  not  the  purely  logical 
ones,  but  those  in  which  both  logical  conditions  and  some  physical 
interpretation  arise.  About  a  physical  theory  there  are  various  questions 
of this kind; but these questions have not yet been studied in detail and we
have  still  to  detect  and  specify  the  problems  occurring  and  to  build  up 
suitable  methods.  I  shall  try  to  set  up  a  general  survey  of  physico-logical 
problems  and  to  summarize  the  general  theory  of  predictions. 

2.  Formal  considerations.  Let  us  take  a  physical  theory  which  is 
considered as complete by a physicist. We can, as in the case of
Euclidean geometry, axiomatize and formalize it, and make about it the
same  formal  enquiries  as  about  a  mathematical  theory.  However,  when 
we  consider  a  modern  physical  theory,  it  is  in  fact  very  difficult  to 
elaborate  a  suitable  axiomatic  system.  Very  often  an  axiomatic  system 
for  a  physical  theory  does  not  cover  all  physical  cases ;  some  exceptional 
case  appears  which  does  not  enter  the  axiomatic  scheme.  Here  I  shall  put 
aside  this  purely  formal  point  of  view,  and  consider  only  physico-logical 
problems. 

3. The three parts of a theory. First of all, many people believe that a
physical theory taken as a whole is a deductive theory, that is, a theory
based upon a few primitive terms and postulates and then developed in a
strictly deductive way. But, in fact, things are not so easy: we find a mixture
of physical notions which have to be clarified by degrees; and the physical
theory will keep the imprint of the efforts which led to its formation. I
have called this first stage the inductive synthesis of the theory [4], which
brings us to the axiomatic part of the theory, itself the second stage. Then
comes the deductive stage, the third one. But in fact, the preceding description
is still too easy. The primitive terms and the postulates are not
introduced all together but progressively. The three stages are mixed
up with one another. What is the deductive part of a subtheory is at the
same time a piece of the inductive synthesis of a more fully developed
part of the whole theory.

Formalisation  can  only  be  applied  to  the  deductive  side  of  the  theory ; 
in  particular,  the  whole  inductive  synthesis  cannot  be  formalised,  but 
only  some  parts  of  it. 

4.  Adequacy.  In  a  physical  theory,  we  cannot  lose  sight  of  the  physical 
meaning  of  the  terms;  we  shall  therefore  remain  at  the  level  of  intuitive 
semantics;  the  requirement  of  adequacy  to  experiment  dominates  any 
study  about  the  notion  of  a  physical  theory.  Adequacy  consists  in  the  fact 
that  the  predictions  calculated  according  to  the  considered  theory,  are 
not  at  variance  with  experiment.  At  best  a  theory  is  adequate  in  a 
certain  field  called  the  adequacy-domain  of  the  theory  [10;  4c,  pp.  40-69]. 

5. Search for a new theory. The search for a better theory belongs to the
normal development of theoretical physics. Physico-logical considerations
allow us to find out whether a new theory should replace an older one,
and to shape a theory better than the given ones.

Processes  of  unification  of  given  theories  can  be  pointed  out,  whether 
these  theories  show  mutual  contradiction  or  not  [5;  4b,  pp.  122-147]. 
When  we  elaborate  a  physical  theory,  we  generally  have  to  take  into 
account  incompatible  conditions.  Various  formal  processes  can  be  used  to 
avoid  the  contradictions,  but  the  difficulty  lies  in  finding  a  formal 
process  appropriate  to  the  physical  requirements. 

6. Formal structure. To each physical theory (as well as to each part of
a physical theory) corresponds a formal structure [3]: the structure of the
formal mathematical system in which the theory is formulated. I shall
call this formal mathematical system the algorithm of the theory. When
we pass over to a better theory, or to the unification of several theories, a
part of the formal structure of the preceding theory is maintained [1];
it helps us to set up the new theory. For instance, if the law of connexions
between observers remains the same when we pass from a theory
$Th_0$ to a theory $Th_1$, then the geometrical algorithm remains unchanged
in the new theory [1; 7]. For example, in classical mechanics the geometrical
algorithm is the vector-calculus in the field of real numbers,
and in wave mechanics we have as the geometrical algorithm a weaker
algorithm. It is necessarily a vector-calculus. On the other hand the
general theory of predictions implies that to each observable corresponds
a linear operator. So this weaker algorithm is a vector calculus on a ring
of operators.

Quite  a  large  part  of  wave  mechanics  can  be  obtained  by  this  process. 

7. General theory of predictions. A more concrete level of the studies on
physical theories appears when one takes into account the fact that the
aims of a physical theory are, at the minimum, to calculate predictions
about the results of future measurements, starting from the results of
initial measurements. In that way, we are led to a general theory of
predictions which has a great many consequences [6; 11; 10b, pp. 91-318].
If an initial experimental datum obtained by an observer $Ob$ about an
observable $A$ on the physical system $S$ at an instant $t_0$ on his clock is
described by a set $\mathcal{E}_A$ of the observational space $(\mathcal{R}_A)$, $\mathcal{E}_A \subset (\mathcal{R}_A)$; and if
we are trying to calculate some prediction for the result of a measurement
which will be realised at an instant $t'$ by an observer $Ob'$ (in the future) on
the system $S$, this prediction will be expressed in terms of a function $\mathfrak{P}$,
the arguments of which are of two kinds: 1°) what we know: $A$, $\mathcal{E}_A$, $t_0$,
and 2°) what we predict: the result of the measurement which can be
obtained at the instant $t'$ of the clock of $Ob'$ by this observer $Ob'$ and
described by a set $\mathcal{E}_B$ of the observational space $(\mathcal{R}_B)$ of the observable $B$,
that is

(1)  $\mathrm{Prob}\{\mathrm{ReMes}\ B \subset \mathcal{E}_B \text{ at } t' \text{ by } Ob' \,/\, \mathrm{ReMes}\ A \subset \mathcal{E}_A \text{ at } t_0 \text{ by } Ob\} = \mathfrak{P}(A, \mathcal{E}_A, t_0, Ob;\ B, \mathcal{E}_B, t', Ob';\ S)$.


The problem of prediction is the problem of the computation of the
$\mathfrak{P}$-functions.

In  the  most  simple  case  we  have  only  one  initial  measurement  and  we 
consider  only  one  observer  (thus  Ob'  is  the  same  observer  as  Ob).  Here  we 
limit  ourselves  to  this  case. 

8. Axiomatisation of measurement. That is the intuitive formulation of
the problem of prediction. We shall now describe this problem in a more
precise and more formal way. The physical system shall be described by a
constant $S$ and a measurement by the predicate Mes; a measurement on
the system $S$ at time $t_0$ with an apparatus $a$ shall be described by

$\mathrm{Mes}(a, S, t_0)$,

this being a primitive term. We admit now:



POSTULATE 1: To each apparatus $a$ corresponds an element $A$ of a set $T$
called "observable" or "type of measurement".

POSTULATE 2: To each measurement at time $t_0$ of type $A$ corresponds
a set $\mathcal{E}_A$ which is a subset of $\mathcal{A}_{A,t_0}$, called "spectrum of $A$ at $t_0$", and
$\mathcal{E}_A = E_1 \times E_2 \times E_3 \times \cdots \times E_n$. The $E_i$ are rational intervals or finite
sets or enumerable sets.

In this case we write

$\mathrm{ReMes}(a_A, S, t_0) \subset \mathcal{E}_A$

and this is called an experimental sentence. So $\mathcal{A}_{A,t_0}$, called spectrum of $A$,
is a subset of an $n$-dimensional space $(\mathcal{R}_A)$ called the observational space
of $A$.

POSTULATE 3: The number $n$ of the sets $E_i$ depends only on the type
of measurement: $n = \varphi(A)$.
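
The content of Postulates 2 and 3 can be pictured with a small data structure. The following is only a sketch under stated assumptions: it covers the case where every factor $E_i$ is a rational interval (finite and enumerable factors are left out), and the names `Interval`, `ResultSet`, `spectrum` and `reading` are hypothetical, introduced here for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Interval:
    """Closed interval with rational end points (one factor E_i)."""
    low: Fraction
    high: Fraction

    def within(self, other):
        # True when this interval is contained in the other one.
        return other.low <= self.low and self.high <= other.high


@dataclass(frozen=True)
class ResultSet:
    """A product E_1 x ... x E_n of rational intervals."""
    factors: tuple

    @property
    def n(self):
        # Number of factors; by Postulate 3 it depends only on the observable.
        return len(self.factors)

    def within(self, other):
        # Inclusion of products, checked factor by factor.
        return self.n == other.n and all(
            a.within(b) for a, b in zip(self.factors, other.factors))


# Hypothetical observable with n = 2 components (values are made up).
spectrum = ResultSet((Interval(Fraction(0), Fraction(10)),
                      Interval(Fraction(-5), Fraction(5))))
reading = ResultSet((Interval(Fraction(1, 2), Fraction(3, 2)),
                     Interval(Fraction(-1), Fraction(1))))

print(reading.within(spectrum), reading.n)
```

Here `reading.within(spectrum)` plays the part of the inclusion of a result set in the spectrum; an imprecise measurement is simply a `ResultSet` with wider intervals.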

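9. Axiomatisation of prediction. For an observable $B$, we consider the
field $E$ of the probabilisable (or measurable) subsets of the spectrum $\mathcal{A}_B$
of $B$. We call a probability for $\mathrm{ReMes}(a_B, S, t) \subset \mathcal{E}_B$, where $\mathcal{E}_B \in E$, the
value of a function $\mathfrak{P}(\mathcal{E}_B)$ defined on $E$ and such that

1°)  $0 \le \mathfrak{P}(\mathcal{E}_B) \le 1$

2°)  $\mathfrak{P}(\mathcal{A}_B) = 1$

3°)  $\mathfrak{P}$ is completely additive: $\mathfrak{P}(\sum_i \mathcal{E}_i) = \sum_i \mathfrak{P}(\mathcal{E}_i)$ if $\mathcal{E}_i \cap \mathcal{E}_j = \emptyset$ for all $i \ne j$

4°)  $\mathfrak{P}$ depends on the measured observable $A$, the result $\mathcal{E}_A$ of the
measurement, the time $t_0$ of this measurement, the observable $B$, the set
$\mathcal{E}_B$, the time $t$ when this measurement shall be made, and the system $S$.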

POSTULATE 4: For a system $S$ there exists at least one function $\mathfrak{P}$
satisfying the conditions 1°-4°.

For each physical theory there is a set of $\mathfrak{P}$-functions which fulfill the
conditions 1°-4° under the conditions fixed by the principles of this
theory. Conversely, any supplementary condition on the $\mathfrak{P}$-functions
defines a class of physical theories. Thus we have a frame to discuss
general properties of a physical theory.
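
Conditions 1°-3° can be checked mechanically when the spectrum is finite. The sketch below is an illustration only: it assumes a three-valued observable with made-up weights, tests additivity on pairs of disjoint subsets (which is all that complete additivity amounts to on a finite spectrum), and uses hypothetical names (`is_probability`, `p_from_weights`, `p_bad`).

```python
from fractions import Fraction
from itertools import combinations


def is_probability(p, spectrum):
    """Check conditions 1-3 for p on every subset of a finite spectrum."""
    points = sorted(spectrum)
    subsets = [frozenset(c) for r in range(len(points) + 1)
               for c in combinations(points, r)]
    # Condition 1: every value lies between 0 and 1.
    if any(not (0 <= p(s) <= 1) for s in subsets):
        return False
    # Condition 2: the whole spectrum has probability 1.
    if p(frozenset(points)) != 1:
        return False
    # Condition 3: additivity on disjoint subsets.
    for s1 in subsets:
        for s2 in subsets:
            if not (s1 & s2) and p(s1 | s2) != p(s1) + p(s2):
                return False
    return True


# Hypothetical weights for a three-valued observable.
weights = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}


def p_from_weights(subset):
    # Additive by construction: sum of the weights of the points.
    return sum((weights[x] for x in subset), Fraction(0))


def p_bad(subset):
    # Squaring keeps conditions 1 and 2 but breaks additivity.
    return p_from_weights(subset) ** 2


print(is_probability(p_from_weights, weights), is_probability(p_bad, weights))
```

Condition 4° is not a constraint of this kind: it only says which data the function may depend on, so it has no counterpart in the check.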

10. Initial elements and prediction elements. To calculate the suitable
$\mathfrak{P}$-functions, I proved [6; 11; 10b, pp. 91-318] that it is possible to do as
follows:



First the initial experimental data are translated into an abstract
language in which a set $\mathfrak{X}_{0,A,\mathcal{E}_A,t_0}$ of abstract elements called "initial
elements" corresponds to the datum:

(2)  $\mathfrak{X}_{0,A,\mathcal{E}_A,t_0} = \mathfrak{G}(A, \mathcal{E}_A, t_0, S)$ and $\mathfrak{X}_{0,A,\mathcal{E}_A,t_0} \subset \mathfrak{X}_0$.

(In wave mechanics, the set $\mathfrak{X}_{0,A,\mathcal{E}_A,t_0}$ reduces to the set of the initial
wave-functions $\{\psi_0\}$ compatible with the result $\mathcal{E}_A$, and $\mathfrak{X}_0$ is the sphere of
unit radius in a Hilbert space.) Then an initial element $X_0$ belonging to
$\mathfrak{X}_{0,A,\mathcal{E}_A,t_0}$
is transformed at the instant $t$ into another abstract element $X(t)$ called
the "prediction element" by a one-one transformation $U(t, t_0)$ such that

(3)  $X(t) = U(t, t_0) X_0$.

(In wave mechanics $X(t)$ reduces to a wave function $\psi(t)$.)

Then the probabilities for the result $\mathcal{E}_B$ for a future measurement can
be calculated by a time-independent function $F$:

(4)  $\mathfrak{P}(A, \mathcal{E}_A, t_0, Ob;\ B, \mathcal{E}_B, t, Ob;\ S) = F(B, \mathcal{E}_B, X; S, Ob)$.

Formulas (2) and (3) are the result of the use of the auxiliary-variables
method; (4) is a condition imposed on the evolution operator $U$.

The set $\mathfrak{X}$ of all X-elements can be considered as a subset of an abstract
vector-space $(\mathcal{Y})$.

If that space $(\mathcal{Y})$ had a finite number of dimensions, then all
observables would have a finite spectrum. Hence, adequacy to experiment
requires that the space $(\mathcal{Y})$ be infinite dimensional.
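
The three steps (initial element, evolution, time-independent function $F$) can be sketched on a toy model. Everything specific below is an assumption made for illustration: the space has only two dimensions (so, by the remark above, it could not be adequate to experiment), the evolution is a plain rotation, and $F$ is taken to be a sum of squared moduli, a rule borrowed from the spectral decomposition theorem of section 12. The names `evolve` and `predict` are hypothetical.

```python
import math


def evolve(t, t0, x0, omega=1.0):
    """Toy one-one evolution U(t, t0): rotation by the angle omega * (t - t0)."""
    a = omega * (t - t0)
    c, s = math.cos(a), math.sin(a)
    return (c * x0[0] - s * x0[1], s * x0[0] + c * x0[1])


def predict(result_set, x):
    """Toy F(B, E_B, X): weight of X on the outcomes listed in result_set.

    Outcome i of the hypothetical observable B is tied to component i of X.
    """
    return sum(abs(x[i]) ** 2 for i in result_set)


x0 = (1.0, 0.0)                       # initial element fixed by the first measurement
x_t = evolve(math.pi / 4, 0.0, x0)    # prediction element X(t) = U(t, t0) X0
p0 = predict({0}, x_t)                # probability of outcome 0 at time t
p1 = predict({1}, x_t)                # probability of outcome 1 at time t

print(p0, p1, p0 + p1)
```

Note that `predict` has no time argument: all the time dependence sits in `evolve`, which is the division of labour expressed by formulas (3) and (4).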

11. Decomposition of a spectrum. If $\mathfrak{D}$ is a decomposition of the
B-spectrum and $\mathcal{E}_i$ an element of $\mathfrak{D}$,
there exists at least one $X_i$ in $\mathfrak{X}$ for which

(5)  $F(B, \mathcal{E}_i, X_i; S, Ob) = 1$,

that is an $X_i$ which guarantees that the B-experimental datum shall be
included in $\mathcal{E}_i$. These $X_i$ can be defined as eigenfunctions of a linear
operator in $(\mathcal{Y})$. Therefore an operator is associated with each observable
by a formal process (without any physical hypothesis; the physical
content is introduced by the analytical form of the operator when this
form is given explicitly) [6; 11; 10b, pp. 91-318].

It is possible to define an equivalence modulo $B$, $\mathfrak{D}$ in which

(6)  $X \equiv \sum_i c_i X_i \pmod{B, \mathfrak{D}}$.

In many cases, with a convenient definition of an abstract integral [6h;
18; 19], when there is a continuous part in the spectrum for $B$, the sum
$\sum_i c_i X_i$ has a limit when we consider the set $\Pi_{\mathfrak{D}}$ of every decomposition
of the B-spectrum; in this case we have

(7)  $X \equiv \int c(d\mathcal{E})\, X(d\mathcal{E}) \pmod{B, \Pi_{\mathfrak{D}}}$.


12. The spectral decomposition theorem. In (6), $c_i$ is a complex number
and there exists at least one function $f_{B,\mathfrak{D}}$ for which

(8)  $F(B, \mathcal{E}_i, X; S, Ob) = f_{B,\mathfrak{D}}(c_i)$.

It is possible to choose a function $f_B$ independent of the decomposition $\mathfrak{D}$
of the B-spectrum [6i, pp. 529-538]; in this case the function $f_B$ must be a
solution of Cauchy's functional equation
and fulfil some accessory conditions like $f(0) = 0$. If we exclude the
totally discontinuous solutions of Hamel, the only (continuous) solutions
are

$f_B(c_i) = |c_i|^k$ with $k > 0$.

Moreover there exists one and only one universal function $f$ independent
of $B$ and $\mathfrak{D}$ when there is a pair of observables which are not
simultaneously measurable [6i, pp. 538-540; 10b, pp. 221-233], that is a
unique value for the constant $k$ which is the same for all observables $B$.
In the case where all observables are simultaneously measurable (as in
classical physics) the value of $k$ remains undetermined under the
condition $k > 0$.

The physical consequence of this fact is that in classical physics, there
exist no interferences of probabilities; on the contrary, in quantum
physics, where non-simultaneously measurable observables exist, there
are interferences of probabilities. Hence the value of $k$ is an important
property of a physical theory with non-simultaneously measurable
observables.

P.  Fevrier  has  proved  [12]  that  the  constant  k  is  equal  to  2,  so  that 
the  following  spectral-decomposition  theorem  is  valid : 

THEOREM: In the case where there exists at least one pair of
non-simultaneously measurable observables, the universal function $f$ is

$f(c_i) = |c_i|^2$;

hence $k = 2$.
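
A small numerical check shows why the exponent matters once two different decompositions are in play. It is not the proof cited above, only an illustration under assumptions: a real two-dimensional space, and two orthonormal bases standing in for the eigen-elements of two observables that are not simultaneously measurable. The names `components`, `total`, `standard` and `rotated` are hypothetical.

```python
import math


def components(x, basis):
    """Coefficients c_i of x along an orthonormal basis (real vectors here)."""
    return [sum(xj * bj for xj, bj in zip(x, b)) for b in basis]


def total(x, basis, k):
    """Sum over the decomposition of f(c_i) = |c_i| ** k."""
    return sum(abs(c) ** k for c in components(x, basis))


x = (0.6, 0.8)                          # element of unit norm
standard = [(1.0, 0.0), (0.0, 1.0)]     # decomposition for a first observable
r = 1 / math.sqrt(2)
rotated = [(r, r), (-r, r)]             # decomposition for a second observable

for k in (1, 2, 3):
    print(k, total(x, standard, k), total(x, rotated, k))
```

In this example the totals equal 1 in both bases only for k = 2, so only that exponent lets the values $f(c_i)$ be read as probabilities for both observables at once; with a single basis, any $k > 0$ could be accommodated by rescaling the coefficients.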

In this case, where there exists a pair of non-simultaneously measurable
observables, it can be proved that the general formalism of predictions cannot be
reduced to a simpler one. On the contrary, when all observables are
simultaneously measurable (in this case the value of $k$ remains arbitrary
under the condition $k > 0$, and in particular we can put $k = 2$), the
general formalism of prediction calculus is valid, but it can be reduced to
a simpler one, that is a phase-space scheme.

So  there  are  two  types  of  physical  theories,  and  only  two:  in  the  first 
type,  there  is  at  least  one  non-simultaneously  measurable  pair  of  ob- 
servables ;  in  the  second  type,  all  observables  are  simultaneously  measur- 
able. The  classical  theories  are  of  the  last  type,  and  the  quantum  ones  are 
of  the  first  type. 

13.  Miscellaneous  notions.  1°)  A  theory  is  called  objectivistic  if  it  is 
possible  to  eliminate  apparatus  of  measurement  from  the  theoretical 
formulation  of  phenomena.  In  this  case  the  formalism  of  prediction 
calculus  is  reducible  to  a  phase-space-scheme,  and  thus  is  of  the  second 
type. 

On  the  contrary,  a  theory  is  called  subjectivistic  if  an  essential  role  is 
played  by  observers  and  apparatus  of  measurement  in  the  theoretical 
formulation  of  the  phenomena.  This  intuitive  definition  is  interpreted 
formally  as  "the  general  formalism  of  prediction  calculus  for  this  theory  is 
not  reducible".  It  results  from  the  above  that  a  subjectivistic  theory  is  of 
the  first  type;  reciprocally  a  theory  of  the  first  type  is  subjectivistic. 

2°) An observable $B$ derives from an observable $A$ if it is possible to
compute the value of $B$ at $t_0$ when the result of a measurement of $A$ at $t_0$
is known. A theory admits a state-observable if there exists an observable
such that all observables derive from it. Otherwise the theory is
without state-observables.

It  can  be  proved  that  if  a  theory  is  without  state-observables  this  theory 
has  at  least  one  pair  of  non  simultaneously  measurable  observables  and  so 
is  of  the  first  type.  It  is  obvious  that  a  theory  with  a  state-observable  has 
all  observables  simultaneously  measurable  and  is  of  the  second  type. 

3°)  An  experimental  datum  is  a  result  of  a  measurement ;  it  depends  on 
the  observed  system  and  on  the  apparatus  of  measurement.  Then  if  an 
experimental  datum  is  an  intrinsic  property  of  the  observed  system,  this 
experimental  datum  is  independent  of  the  apparatus  of  measurement.  If 
all  experimental  data  are  intrinsic  properties  of  the  observed  physical 
system,  then  the  apparatus  of  measurement  does  not  play  an  essential 
role  and  the  theory  is  objectivistic. 

On the contrary, if the experimental data are not intrinsic properties of
the system, they depend on the apparatus, which then plays an essential role
in the theoretical description, so that the theory is subjectivistic.

4°) An imprecise experimental datum is analysable if it is equivalent
either to consider the result $\mathcal{E}$ of the measurement ($\mathcal{E}$ is a set, see Postulate
2), or to consider the result $\mathcal{E}_1$ or the result $\mathcal{E}_2$, when $\mathcal{E}_1 \cup \mathcal{E}_2 = \mathcal{E}$. In
other terms, a result of a measurement is analysable if for every
prediction it is equivalent to consider the experimental sentence $p$
corresponding to $\mathcal{E}$, or to consider the logical sum $p_1 \vee p_2$ (where $p_1$
corresponds to $\mathcal{E}_1$ and $p_2$ to $\mathcal{E}_2$).

When an imprecise experimental datum is not analysable, it is
impossible to attribute a precise but unknown value to the measured
observable. By means of the connexion between experimental sentences
and closed linear manifolds in the space $(\mathcal{Y})$ it can be proved [10b, pp.
156-159, 275-280] that, if the imprecise experimental data are all
analysable, then the theory is objectivistic, and if there is some
non-analysable experimental datum, then the theory is subjectivistic.

5°) The term "by right" means: "with respect to the requirements of the
theory". On the other hand "in fact" would mean: "with respect to
experiment".

It  is  very  difficult  to  describe  formally  the  notion  of  complementarity. 
In  order  to  be  complementary,  two  observables  must  be  non-simultane- 
ously  measurable  by  right.  That  condition  can  be  taken  as  a  formal  des- 
cription of  complementarity ;  hence  a  theory  with  complementarity  is  a 
theory  including  non-simultaneously  measurable  observables. 

6°) A theory is deterministic by right if there exists at least one initial
element $X_0$ such that from this element $X_0$ it is possible to predict with
certainty the value of all observables at any time. A theory is called
essentially indeterministic if it does not contain such an $X_0$. It can be
proved that a subjectivistic theory is essentially indeterministic, and that
an essentially indeterministic theory (i.e. a theory with indeterminism
by right) is a subjectivistic theory [10b, pp. 241-244, 260-284].

7°) In a subjectivistic theory, it is necessary to use an apparatus in
order to obtain some information on the observed physical system, and
that apparatus cannot be eliminated from the theoretical description.
Conversely, if the use of an apparatus is essential by right (and not only in
fact) the theory is subjectivistic.

The preceding notions can be defined more precisely; to each
physical notion corresponds a definite term in the formal description of
the physical facts, that is in the formalism of the prediction calculus;
such definitions express, in a precise way, the properties pointed out
here.

From these definitions it follows that the uniqueness of the $f$-function
and the form imposed by the spectral decomposition theorem are a
consequence of any single one of the following assumptions, and that any one of them
implies the others:

1)  the  theory  is  a  subjectivistic  one, 

2)  there  is  no  state-observable, 

3)  an  experimental  datum  is  not  an  intrinsic  property  of  the  observed 
physical  system, 

4)  imprecise  experimental  data  cannot  be  analysed, 

5)  there  are  two  observables  not  simultaneously  measurable, 

6)  there  is  some  complementarity, 

7)  there  is  essential  indeterminism, 

8)  by  right  it  is  necessary  to  use  an  apparatus  in  order  to  obtain  some 
information  on  the  observed  physical  system. 

This last condition is the most intuitive for microphysics and it can be
taken as a postulate in the form of a principle of observability [13; 10b,
pp. 316-318].

On  the  contrary,  if  we  assume  the  negation  of  one  of  the  above  as- 
sumptions, this  implies  the  negation  of  the  others  and  the  prediction 
scheme  reduces  to  a  phase-space  scheme.  These  conditions  have  as  a 
consequence  that  the  observable  physical  systems  can  be  divided  into 
two  classes: 

1) systems which are, by right, directly observable by means of the
sense organs of the observers (i.e. systems for which all observables are
simultaneously measurable by right);

2) systems which, by right, can only be observed indirectly by means of
certain systems of the preceding class called "apparatus" (i.e. systems in
which there exists at least one pair of non-simultaneously measurable
observables).

14. The principle of evolution. In the general formalism of our
prediction calculus, the evolution of the observed physical system $S$ is
described only by the evolution operator $U$. Any condition concerning the
evolution of $S$ consists in a condition assigned to $U(t, t_0)$.

To determine the evolution of this $U$-operator it is natural to admit the
following principle as a fundamental property for predictions [14]: "If
during the time interval $(t_0, t]$ no measurement is realised on the observed
physical system $S$ (an initial measurement being made at $t_0$), then the
prediction for an instant $\tau$ (between $t_0$ and $t$) has an effect upon the
predictions for the instant $t$, and this for all $\tau$".

Any prediction for the instant $\tau$ is obtained from a prediction
element $X(\tau)$ and $X(\tau) = U(\tau, t_0) X_0$. A prediction for the instant $t$ is
calculated from the predictions for different times between $t_0$ and $t$. That
is, any prediction for an instant $\tau$ is considered as an indication for
computing a prediction for the instant $t$. A prediction for the instant $\tau$ is
computed from $X(\tau)$ (by the spectral decomposition theorem); in other
words this indication is described by $X(\tau)$, and the contribution of $X(\tau)$ in
order to calculate $X(t)$ is an element $Y(t, \tau)$ obtained as a function of
$X(\tau)$, that is

$Y(t, \tau) = \mathfrak{F}(t, \tau) X(\tau)$

where $\mathfrak{F}(t, \tau)$ is an operator.
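Considering $n + 1$ instants

$\tau_0 = t_0, \tau_1, \tau_2, \ldots, \tau_i, \ldots, \tau_{n-1}, \tau_n = t$,

we add up the contributions $Y(t, \tau_i)$ of these instants. The process used
to define an integral then gives us

$X(t) = X_0(t) + \int_{t_0}^{t} \mathfrak{F}(t, \tau) X(\tau)\, d\tau$

where

$X_0(t) = \lim_{n \to \infty} \mathfrak{F}(t, t_0) X_0\, \Delta\tau_0$.

This is a functional equation for $X(t)$. Since we have

$X(t) = U(t, t_0) X_0$,

it follows that

$U(t, t_0) = U_0(t, t_0) + \int_{t_0}^{t} \mathfrak{F}(t, \tau) U(\tau, t_0)\, d\tau$

with $U_0(t, t_0) X_0 = X_0(t)$.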

The  equation  for  the  operator  U(t,  to)  has  the  form  of  a  Volterra's 
integral  equation  of  hereditary  process,  but  it  is  an  equation  between 
operators  and  not  an  equation  between  functions. 

If $U_0(t, t_0)$ has the properties of an evolution operator, it can be
interpreted as the evolution operator of a fictive system $S_0$ called a substratum
for $S$. Also $S$ can be interpreted as a perturbed system and $S_0$ as a
non-perturbed system. The equation in $U$ can be solved by a process of
successive approximations; the first step gives the usual perturbation of
first order and the further steps the perturbations of higher orders [18].
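
The method of successive approximations mentioned here can be tried on a scalar stand-in for the operator equation. The sketch below is an assumption-laden illustration, not the operator calculus of the text: the unknown is an ordinary function u, the kernel is an ordinary function, the integral is replaced by the trapezoidal rule on a fixed grid, and the name `picard_volterra` is hypothetical.

```python
import math


def picard_volterra(u0, kernel, t0, t, steps=200, iterations=25):
    """Successive approximations for
        u(s) = u0(s) + integral from t0 to s of kernel(s, r) * u(r) dr.

    Scalar stand-in for the operator equation; trapezoidal quadrature.
    """
    h = (t - t0) / steps
    grid = [t0 + i * h for i in range(steps + 1)]
    u = [u0(s) for s in grid]                  # zeroth approximation: u = u0
    for _ in range(iterations):
        new_u = []
        for i, s in enumerate(grid):
            integral = 0.0
            for j in range(i):                 # trapezoid on [grid[j], grid[j + 1]]
                left = kernel(s, grid[j]) * u[j]
                right = kernel(s, grid[j + 1]) * u[j + 1]
                integral += 0.5 * h * (left + right)
            new_u.append(u0(s) + integral)     # next approximation at s
        u = new_u
    return grid, u


# Test case with a known answer: u0 = 1 and kernel = 1 give u(s) = exp(s - t0).
grid, u = picard_volterra(lambda s: 1.0, lambda s, r: 1.0, 0.0, 1.0)
print(u[-1], math.exp(1.0))
```

Each pass through the outer loop feeds the previous approximation back into the integral, which is the scalar analogue of passing from one order of perturbation to the next.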

In the general case, $U$ is not differentiable and there is no Hamiltonian,
and thus no wave equation; but in many particular cases, $U$ has a time
derivative and obeys a differential equation:


where $H$ is an operator called the Hamiltonian.
We  have 


if $U_0(t, t_0)$ obeys an equation of this form. We have the wave equation if
$U_0$ and $\mathfrak{F}(t, \tau)$ have a time derivative and if $\mathfrak{F}(t, \tau)$ tends to a limit when $\tau$
tends to $t$. But in general $\mathfrak{F}(t, \tau)$ does not tend to a limit when $\tau$ tends to $t$
and there is only the integral operatorial equation to describe the
evolution of the system.

15. Experimental sentences. The general theory of predictions leads us
to single out sentences of a special type: the experimental sentences, on
which a calculus can be defined. Thus we get an algebra, which plays an
important part in the physical theories under consideration [15; 10b,
pp. 91-215].

16.  Search  for  new  theories.  Physico-logical  studies  alone  do  not  allow 
us  to  build  a  new  physical  theory  [16;  4c,  pp.  54-60].  A  new  theory  can 
only  be  obtained  by  thoroughly  deepening  the  meaning  of  the  purely 
physical  notions  of  a  theory.  But  physico-logical  studies  definitely  help 
us.  For  example,  in  the  recent  discussions  about  the  quantum  theories, 
concerning  the  discrepancy  between  the  statistical  interpretation  and  the 
causal  one,  the  physico-logical  considerations  served  to  yield  precise 
answers:  if  we  have  an  essentially  indeterministic  theory,  it  is  always 
possible  to  construct  a  deterministic  theory  which  gives  us  the  same  re- 
sults (i.e.  the  same  predictions  concerning  future  measurements)  under 
the  following  conditions :  i)  the  notion  of  a  physical  system  is  not  the  same 
in  both  theories,  ii)  we  must  add  hidden  parameters,  some  belonging  to 
the  physical  system  (in  the  sense  of  the  indeterministic  theory)  and  some 
to  the  measuring  apparatus;  moreover  some  of  these  hidden  parameters 
are  not  measurable  in  any  way  (they  are  metaphysical  parameters) 
[8;  12c,  pp.  43-100].  Reciprocally  P.  Fevrier  has  proved  [17;  12c,  pp. 
135-150]  that,  if  we  have  a  deterministic  theory  with  hidden  parameters, 
by  eliminating  these  parameters  and  modifying  the  notion  of  a  physical 
system,  we  obtain  an  essentially  indeterministic  theory.  Hence  the  notions 
of  determinism  and  indeterminism  are  not  physical  notions,  properties  of 
nature,  but  are  relative  to  the  theoretical  requirements. 

17.  Various  levels.  Whereas,  in  the  study  of  mathematical  theories,  it  is 
enough  to  distinguish  two  levels:  the  theoretical  one,  and  the  metatheo- 
retical  one,  or  in  other  words,  the  language  and  the  metalanguage,  in  the 
study  of  physical  theories,  we  have  to  distinguish  a  greater  number  of 
levels:  for  instance,  the  language  of  the  experimental  sentences,  the 
language  of  predictions,  the  language  of  the  theory,  the  metalanguage 
[15]. 

18. Various approaches. Physico-logical studies are still little developed,
and many problems remain to be formulated. To end, I shall point out the
main approaches as follows:

a)  To  study  in  a  strictly  logical  way  a  given  physical  theory  only  taken 
as  a  deductive  theory. 



b)  To  elaborate  general  physico-logical  considerations  when  a  con- 
nexion with  experiment  is  introduced  by  the  notion  of  adequacy. 

c)  To  come  to  more  particular  physico-logical  considerations  when  the 
formal  structure  of  a  theory  is  taken  into  account. 

d)  To  draw  the  consequences  of  the  following  notions:  measurements, 
experimental  statements,  predictions.  That  is  to  say:  to  work  out  the 
general  theory  of  predictions. 

e)  To  study  the  calculus  of  experimental  sentences  and  enter  into 
epitheoretical  considerations  about  the  general  theory  of  predictions. 

f) In particular, the physico-logical researches allow us to separate in a
physical theory the intrinsic (or objectivistic) properties of the physical
objects, from those which are intrinsic properties of the compound
object-apparatus, but not of the objects themselves. Criteria for the
intrinsic and extrinsic properties have been mentioned.

These considerations on intrinsic and non-intrinsic properties played
an important part in the recent developments of physical theories,
namely in the elaboration of the functional theory of particles. In this
theory, a particle is no longer described by a point, but by a function $u$
or a finite set of functions $u_i$. I have not space enough here to give details
about this theory which I have developed in recent papers [9].

19. Conclusion. The modern physical theories involve such various and
mixed levels of thought that, besides purely physical, logical and
mathematical considerations, they need intermediate researches in order to
connect together these different kinds of developments.

Physico-logic  is  such  an  intermediate  field,  and  that  is  the  reason  why 
physico-logical  methods  do  not  quite  fulfill  the  formal  conditions  required 
either  from  a  physical  theory  or  from  a  logical  one.  But  one  cannot  hope 
to  surmount  the  present  heavy  difficulties  of  theoretical  physics  only  by 
means  of  the  formal  achievement  of  reasonings.  Adequacy  has  to  be 
realised  first  of  all  by  a  physical  theory  and,  for  that  purpose,  physico- 
logical  studies  can  be  very  helpful  and  set  the  theoretical  developments  in 
their  right  connection  with  experiment.  They  are  presently  in  their  first 
stage,  like  the  studies  about  foundations  of  mathematics  at  their  be- 
ginning; the  formal  achievement  does  not  appear  at  the  beginning,  it 
depends  on  the  efficiency  of  the  methods  under  consideration,  and,  on  the 
other  hand,  their  efficiency  depends  on  their  formal  strictness.  Physico- 
logical  studies  must  be  broadly  developed  in  both  directions,  and  play 
an  important  part  in  the  future. 



Bibliography 

[1]  AESCHLIMANN, F., a) Sur la persistance des structures géométriques dans le
développement des théories physiques. Comptes Rendus des séances de l'Académie
des Sciences de Paris, vol. 232 (1951), pp. 695-597.
b) Recherches sur la notion de système physique. Thèse de Doctorat ès-Sciences,
Paris 1957.

[2]  AESCHLIMANN, F. and J. L. DESTOUCHES, L'électromagnétisme non linéaire et les
photons en théorie fonctionnelle des corpuscules. Journal de Physique et le
Radium, t. 18 (1957), p. 632.

[3]  CAZIN, M., a) Algorithmes et théories physiques. Comptes Rendus des séances
de l'Académie des Sciences de Paris, t. 224, pp. 541-543.
b) Algorithmes et construction d'une théorie unifiante. Comptes Rendus des
séances de l'Académie des Sciences, t. 224 (1947), pp. 805-807.
c) Persistance des structures formelles dans le développement des théories
physiques. Thèse de Doctorat Univ. Paris, Lettres-Philosophie, Paris 1947.
d) Les structures formelles des mécaniques ondulatoires et leur persistance dans
les nouvelles tentatives théorique. Thèse de Doctorat ès-Sciences, Paris 1949.

[4]  DESTOUCHES, J. L., a) Essai sur la forme générale des théories physiques. Thèse
principale pour le Doctorat ès-Lettres, Paris 1938. Monographies mathématiques
de l'Université de Cluj, fasc. VII, Cluj (Roumanie) 1938.
b) Principes fondamentaux de Physique théorique. Vol. 1, Paris 1942, 174 +
IV pp.
c) Traité de physique théorique et de physique mathématique, t. I. Méthodologie,
Notions géométriques, vol. I, Paris 1953, 228 + XIV pp.

[5]  DESTOUCHES, J. L., a) Unité de la physique théoriques. Comptes Rendus des
séances de l'Académie des sciences de Paris, vol. 205 (1947), pp. 843-845.
b) Essai sur l'Unité de la physique théorique. Thèse complémentaire pour le
Doctorat ès-Lettres, Paris 1938; Bulletin scientifique de l'École
polytechnique de Timisoara, Roumanie 1938.

[6] DESTOUCHES, J. L., a) Les espaces abstraits en Logique et la stabilité des
    propositions. Bulletin de l'Académie royale de Belgique (classe des sciences)
    5° sér., vol. XXI (1935), pp. 780-86.
    b) Le rôle de la notion de stabilité en physique. Bulletin de l'Académie royale
    de Belgique (classe des sciences) 5° sér., vol. XXII (1936), pp. 525-532.
    c) Conditions minima auxquelles doit satisfaire une théorie physique. Bulletin
    de l'Académie royale de Belgique (classe des sciences) 5° sér., vol. XXIII
    (1937), pp. 159-165.
    d) Loi générale d'évolution d'un système physique. Journal de Physique et le
    Radium, sér. 7, vol. 7 (1936), pp. 305-311.
    e) La notion de grandeur physique. Journal de Physique et le Radium, sér.
    7, vol. 7 (1936), pp. 354-360.
    f) Le principe de Relativité et la théorie générale de l'évolution d'un système
    physique. Journal de Physique et le Radium, sér. 7, vol. 7 (1936), pp.
    427-433.
    g) Les prévisions en physique théorique. Communication au Congrès international
    de Philosophie des Sciences, Octobre 1949, Actualités scientifiques
    et industrielles, Hermann, Paris 1949.
    h) Corpuscules et Systèmes de Corpuscules, Notions fondamentales. Vol. 1,
    Paris 1941, 342 pp.
    i) Principes fondamentaux de physique théorique. Vol. II, Paris 1942, 484 +
    VI pp.; vol. III, Paris 1942, 248 + IV pp.
    j) Über den Aussagenkalkül der Experimentalaussagen. Archiv für mathematische
    Logik und Grundlagenforschung, Heft 2/2-4, pp. 424-25.

[7] DESTOUCHES, J. L., Cours miméogr. Faculté des Sciences, Paris 1957.

[8] DESTOUCHES, J. L., a) Sur l'interprétation physique de la Mécanique ondulatoire
    et l'hypothèse des paramètres cachés. Journal de Physique et le Radium,
    vol. 13 (1952), pp. 354-358.
    b) Sur l'interprétation physique des théories quantiques. Journal de Physique
    et le Radium, vol. 13 (1952), pp. 385-391.

[9] DESTOUCHES, J. L., a) Funktionnelle Theorie der Elementarteilchen. Vorlesung
    Pariser Universitätswoche, München 1955, pp. 176-183.
    b) Fonctions indicatrices de spectres. Journal de Physique et le Radium, vol.
    17 (1956), p. 475.
    c) Quantization in the functional theory of particles. Nuovo Cimento, suppl.
    vol. III, sér. X (1956), pp. 433-468.
    d) La quantification en théorie fonctionnelle des corpuscules. Vol. 1, Paris 1956,
    VI + 144 pp.
    e) Le graviton et la gravitation en théorie fonctionnelle des corpuscules. Comptes
    Rendus des séances de l'Académie des Sciences de Paris, vol. 245 (1957),
    pp. 1518-1520.
    f) La gravitation en théorie microphysique non linéaire. Journal de Physique et
    le Radium, vol. 18 (1957), p. 642.
    g) Le graviton en théorie fonctionnelle des corpuscules. Journal de Physique et
    le Radium, vol. 19 (1958), pp. 135-139.
    h) Journal de Physique et le Radium, vol. 19 (1958) (sous presse).
    i) Corpuscules et champs en théorie fonctionnelle. Vol. 1, Paris 1958, VIII +
    164 pp.
    j) Les systèmes de corpuscules en théorie fonctionnelle (A.S.L. Hermann, Paris
    1958).

[10] FÉVRIER, P., a) Recherches sur la structure des théories physiques. Thèse Sciences
    Math. Univers., Paris 1945.
    b) La structure des théories physiques. Paris, 1951, XII + 424 pp.
    c) Logical Structure of Physical Theories. This volume.

[11] FÉVRIER, P., a) Déterminisme et indéterminisme. Vol. 1, Paris 1955, 250 pp.
    b) L'interprétation physique de la Mécanique ondulatoire et des théories
    quantiques. Vol. 1, Paris 1956, 216 pp.
    c) Determinismo e indeterminismo. Vol. 1, Mexico 1957, 270 pp.

[12] FÉVRIER, P., a) Signification profonde du principe de décomposition spectrale.
    Comptes Rendus des séances de l'Académie des Sciences de Paris, vol. 222
    (1946), pp. 866-868.
    b) Sur l'interprétation physique de la Mécanique ondulatoire. Comptes Rendus
    des séances de l'Académie des Sciences de Paris, vol. 222 (1946), p. 1087.
    c) L'interprétation physique de la Mécanique ondulatoire et des théories
    quantiques. Vol. 1, Paris 1956, 216 pp.

[13] FÉVRIER, P., Monde sensible et monde atomique. Theoria (Philosophical Miscellany
    presented to Alf Nyman), 1949, pp. 79-88.

[14] FÉVRIER, P., a) Sur la recherche de l'équation fonctionnelle d'évolution d'un
    système en théorie générale des prévisions. Comptes Rendus des séances de
    l'Académie des Sciences de Paris, vol. 230 (1950), pp. 1742-1744.
    b) Sur la notion de système physique. Comptes Rendus des séances de l'Académie
    des Sciences de Paris, vol. 233 (1959), p. 604.

[15] FÉVRIER, P., La logique des propositions expérimentales. Actes du 2° colloque
    de Logique mathématique de Paris 1952, Paris 1954, pp. 115-118.

[16] FÉVRIER, P., a) Sur la notion d'adéquation et le calcul minimal de Johansson.
    Comptes Rendus des séances de l'Académie des Sciences de Paris, vol. 224
    (1947), pp. 545-548.
    b) Adéquation et développement dialectique des théories physiques. Comptes
    Rendus des séances de l'Académie des Sciences de Paris, vol. 224 (1947),
    pp. 807-810.

[17] FÉVRIER, P., Sur l'élimination des paramètres cachés dans une théorie physique.
    Journal de Physique et le Radium, vol. 14 (1953), p. 640.

[18] GUY, R., a) Comptes Rendus des séances de l'Académie des Sciences de Paris,
    1950-1953.
    b) Thèse de Doctorat ès-Sciences mathématiques, Univ. Paris 1954.

[19] NIKODYM, O. M., Remarques sur les intégrales de M. J. L. Destouches considérées
    dans sa théorie des prévisions. Comptes Rendus des séances de l'Académie
    des Sciences de Paris, vol. 225 (1947), p. 479.


PART  III 

GENERAL  PROBLEMS  AND  APPLICATIONS 
OF  THE  AXIOMATIC  METHOD 


Symposium  on  the  Axiomatic  Method 


STUDIES  IN  THE  FOUNDATIONS  OF  GENETICS 

J.  H.  WOODGER 

University  of  London,  London,  England 

In what follows a fragment of an axiom system is offered: a fragment
because it is still under construction. One of the ends in view in
constructing  this  system  has  been  the  disclosure,  as  far  as  possible,  of  what 
is  being  taken  for  granted  in  current  genetical  theory,  in  other  words  the 
discovery  of  the  hidden  assumptions  of  this  branch  of  biology.  In  the 
following  pages  no  attempt  will  be  made  to  give  a  comprehensive  account 
of  all  the  assumptions  of  this  kind  which  have  so  far  been  unearthed; 
attention will be chiefly concentrated on one point: the precise formulation
of what is commonly called Mendel's First Law, and its formal
derivation  from  more  general  doctrines,  no  step  being  admitted  only 
because  it  is  commonly  regarded  as  intuitively  obvious.  Mendel's  First 
Law  is  usually  disposed  of  in  a  few  short  sentences  in  text-books  of 
genetics,  and  yet  when  one  attempts  to  formulate  it  quite  explicitly  and 
precisely  a  considerable  wealth  and  complexity  of  hidden  assumptions  is 
revealed.  Another  and  related  topic  which  can  be  dealt  with  by  the 
axiomatic  method  is  the  following.  Modern  genetics  owes  its  origin  to  the 
genius  of  Mendel,  who  first  introduced  the  basic  ideas  and  experimental 
procedures  which  have  been  so  successful.  But  it  is  time  to  inquire  how 
far  the  Mendelian  hypotheses  may  now  be  having  an  inhibiting  effect  by 
restricting  research  to  those  lines  which  conform  to  the  basic  assumptions 
of  Mendel.  It  may  be  profitable  to  inquire  into  those  assumptions  in  order 
to  consider  what  may  happen  if  we  search  for  regions  in  which  they  do 
not  hold.  The  view  is  here  taken  that  the  primary  aim  of  natural  science 
is  discovery.  Theories  are  important  only  in  so  far  as  they  promote 
discovery  by  suggesting  new  lines  of  research,  or  in  so  far  as  they  impose 
an order upon discoveries already made. But what constitutes a discovery?
This is not an easy question to answer. It would be easier if we
could  identify  observation  and  discovery.  But  the  history  of  natural 
science  shows  abundantly  that  such  an  identification  is  impossible. 
Christopher  Columbus  sailed  west  from  Europe  and  returned  with  a 
report that he had found land. What made this a discovery was the fact
that subsequent travellers after sailing west from Europe also returned
with  reports  which  agreed  with  that  of  Columbus.  If  the  entire  American 
continent  had  quietly  sunk  beneath  the  wave  as  soon  as  Columbus's  back 
was  turned  we  should  not  now  say  that  he  had  discovered  America,  even 
although  he  had  observed  it.  If  an  astronomer  reported  observing  a  new 
comet  during  a  certain  night,  but  nobody  else  did,  and  neither  he  nor 
anybody  else  reported  it  on  subsequent  nights,  we  should  not  say  that  he 
had  made  a  discovery,  we  should  say  that  he  had  made  a  mistake. 
Observations  have  also  been  recorded  which  have  passed  muster  for  a 
time  but  have  finally  been  rejected,  so  that  these  were  not  discoveries. 
Moreover,  there  have  been  observations  (at  least  in  the  biological  sciences) 
which  have  been  ignored  for  nearly  fifty  years  before  they  have  been 
recognized  as  discoveries.  Theories  play  an  important  part  in  deciding 
what  is  a  discovery.  Under  the  influence  of  the  doctrine  of  preformation, 
in  the  early  days  of  embryology,  microscopists  actually  reported  seeing 
little  men  coiled  up  inside  spermatozoa.  Under  the  influence  of  von  Baer's 
germ-layer  theory  the  observations  of  Julia  Platt  on  ecto-mesoderm  in 
the  1890s  were  not  acknowledged  as  discoveries  until  well  into  the 
twentieth century. Such considerations raise the question: is Mendelism
now having a restricting effect on genetical research?

The  distinction  between  records  of  observations  and  formulations  of 
discoveries is particularly sharp in genetics; as we see when we attempt to
formulate carefully Mendel's observations on the one hand and the discoveries
attributed to him on the other. It will perhaps make matters
clearer  if  we  first  of  all  distinguish  between  accessible  and  inaccessible 
sets.  Accessible  sets  are  those  whose  members  can  be  handled  and  counted 
in  the  way  in  which  Mendel  handled  and  counted  his  tall  and  dwarf  garden 
peas.  Inaccessible  sets,  on  the  other  hand,  are  those  to  which  reference 
is  usually  being  made  when  we  use  the  word  'all'.  The  set  of  all  tall 
garden  peas  is  inaccessible  because  some  of  its  members  are  in  the  remote 
past, some are in the (to us) inaccessible future, and some are in inaccessible
places. No man can know its cardinal number. But observation
records  are  statements  concerning  accessible  sets  and  formulations  of 
discoveries  are  statements  concerning  inaccessible  sets.  The  latter  are 
therefore  hypothetical  in  a  sense  and  for  a  reason  which  does  not  apply 
to the former statements. But there are other kinds of statements about
inaccessible sets in addition to 'all'-statements. In fact, from the point
of  view  of  discoveries,  the  latter  can  be  regarded  as  a  special  case  of  a  more 
general kind of statement, namely those statements which give expression
to hypotheses concerning the proportion of the members of one set, say X,
which  belong  to  a  second  set  Y.  When  that  proportion  reaches  unity  we 
have  the  special  case  where  all  Xs  are  Ys.  In  the  system  which  is  given 
in the following pages the notation 'pY' is used to denote the set of all
classes X which have a proportion p of their members belonging to Y,
p being a fraction such that $0 \le p \le 1$. This notation can be used in
connexion  with  both  accessible  and  inaccessible  sets.  In  the  latter  case 
it  is  being  used  to  formulate  statements  which  cannot,  from  the  nature 
of  the  case,  be  known  to  be  true.  Such  a  statement  may  represent  a  leap 
in  the  dark  from  an  observed  proportion  in  an  accessible  set,  or  it  may  be 
reached  deductively  on  theoretical  grounds.  In  either  case  the  continued 
use  of  a  particular  hypothesis  of  this  kind  depends  on  whether  renewed 
observations  continue  to  conform  to  it  or  not.  Statistical  theory  provides 
us  with  tests  of  significance  which  enable  us  to  decide  which  of  two 
hypotheses  concerning  an  inaccessible  set  accords  better  with  a  given  set 
of  observations  made  on  accessible  sub-sets  of  the  said  inaccessible  set. 
In  the  present  article  we  are  not  concerned  with  the  questions  of  testing 
but  with  those  parts  of  genetical  theory  which  are  antecedent  to  directly 
testable  statements.  At  the  same  time  it  must  be  admitted  that  more  is 
assumed  in  the  hypothesis  than  that  a  certain  inaccessible  set  contains  a 
proportion  of  members  of  another  set.  As  observations  take  place  in 
particular  places,  at  particular  times,  must  there  not  be  an  implicit 
reference  to  times  and  places  in  the  hypotheses  concerning  inaccessible 
sets, if such hypotheses are to be amenable to testing against observations?
Consider, for example, the hypothesis that half the human children
at  the  time  of  birth  are  boys.  This  would  be  the  case  if  all  children  born  in 
one  year  were  boys  and  all  in  the  next  year  were  girls,  and  so  on  with 
alternate  years,  provided  the  same  number  of  children  were  born  in  each 
year.  But  clearly  a  more  even  spread  over  shorter  intervals  of  time  is 
intended  by  the  hypothesis.  Again,  there  cannot  be  an  unlimited  time 
reference,  because  according  to  the  doctrine  of  evolution  there  will  have 
been  a  time  when  no  children  were  born,  and  if  the  earth  is  rendered 
uninhabitable  by  radio-activity  a  time  will  come  when  no  more  children 
are  born.  Thus  a  set  which  has  accessible  sub-sets  during  one  epoch  may 
be  wholly  inaccessible  in  another. 
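
For accessible (finite, countable) sets the proportion notation can be made concrete. The following Python sketch is an illustration only: the names `proportion` and `in_pY` and the birth data are invented here and form no part of the system. It also reproduces the point about alternate years, where the overall proportion is one half although no single year shows it.

```python
from fractions import Fraction

def proportion(X, Y):
    """Fraction of the members of the finite set X that also belong to Y."""
    X = set(X)
    if not X:
        raise ValueError("proportion is undefined for an empty set")
    return Fraction(len(X & set(Y)), len(X))

def in_pY(X, Y, p):
    """True when X belongs to pY, i.e. a proportion p of X lies in Y."""
    return proportion(X, Y) == Fraction(p)

# Invented toy data: births as (year, index) pairs; boys only in even years.
births = {(year, i) for year in range(4) for i in range(10)}
boys = {(year, i) for (year, i) in births if year % 2 == 0}

overall = proportion(births, boys)
by_year = [proportion({b for b in births if b[0] == y}, boys) for y in range(4)]
```

Here `overall` is one half, while `by_year` alternates between 1 and 0, so a statement of the form 'births belongs to ½ boys' says nothing by itself about how the proportion is spread over time.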

In  what  follows  no  attempt  will  be  made  to  solve  all  these  difficult 
problems;  we  shall  follow  the  usual  custom  in  natural  science  and  ignore 
them.  Attention  will  be  confined  to  the  one  problem  of  formulating 
Mendel's First Law. In the English translation of Mendel's paper of 1866,
which is given in W. BATESON'S Mendel's Principles of Heredity, Cambridge
1909, we read (p. 338):

Since  the  various  constant  forms  are  produced  in  one  plant,  or  even  in  one 
flower  of  a  plant,  the  conclusion  appears  to  be  logical  that  in  the  ovaries  of  the 
hybrids  there  are  as  many  sorts  of  egg  cells,  and  in  the  anthers  as  many  sorts  of 
pollen  cells,  as  there  are  possible  constant  combinations  of  forms,  and  that 
these  egg  and  pollen  cells  agree  in  their  internal  composition  with  those  of  the 
separate  forms. 

In  point  of  fact  it  is  possible  to  demonstrate  theoretically  that  this  hypothesis 
would fully suffice to account for the development of the hybrids in the separate
generations, if we might at the same time assume that the various kinds of
egg and pollen cells were formed in the hybrids on the average in equal numbers.

Bateson adds, in a foot-note to the last paragraph: 'This and the preceding
paragraph contain the essence of the Mendelian principles of heredity.'
It will be shown below that much more must be assumed than is explicitly
stated here. L. Hogben, in Science for the Citizen, London, 1942, in
speaking  of  Mendel's  Second  Law  mentions  the  first  in  the  following 
passage  (p.  982) : 

It  is  not,  however,  a  law  in  the  same  sense  as  Mendel's  First  Law,  of  segregation, 
which  we  have  deduced  above,  for  it  is  only  applicable  in  certain  cases,  and  as 
we  shall  see  later,  the  exceptions  are  of  more  interest  than  the  rule. 

But  surely,  Mendel's  First  Law  is  also  only  applicable  in  certain  cases, 
and  if  this  is  not  generally  recognized  it  is  because  the  law  is  never  so 
formulated  as  to  make  clear  what  those  cases  are.  We  cannot  simply  say 
that  if  we  interbreed  any  hybrids  the  offspring  will  follow  the  same  rules 
as  were  reported  in  Mendel's  experiments  with  garden  peas,  because  it 
would be possible to quote counter-examples. It is hoped that the following
analysis will throw some light on this question and that in this case
also  the  exceptions  may  prove  to  be  of  at  least  as  much  theoretical  interest 
as  the  rule.  It  will  be  shown  that  the  condition  referred  to  in  the  second  of 
the  above  two  paragraphs  from  Mendel's  1866  paper  is  neither  necessary 
nor  sufficient  to  enable  us  to  derive  the  relative  frequencies  of  the  kinds 
of  offspring  obtainable  from  the  mating  of  hybrids.  It  is  not  sufficient 
because  it  is  also  necessary  to  assume  (among  other  things)  that  the  union 
of the gametes takes place at random. It is not necessary because if the
random  union  of  the  gametes  is  assumed  the  required  frequencies  can  be 
derived without the assumption of equal proportions of the kinds of
gametes. At the same time it will be seen that a number of other assumptions
are necessary which are not usually mentioned and thus that a
good  deal  is  being  taken  for  granted  which  may  not  always  be  justified. 
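
The role of random union can be illustrated numerically. The Python sketch below is an illustration under simplifying assumptions that are not taken from the text: a single pair of gamete kinds, labelled 'A' and 'a' purely for convenience, and a hypothetical function `zygote_frequencies`. It is not the formal derivation, which belongs to the axiom system itself. With random union the zygote frequencies follow from whatever proportion of 'A' gametes is supposed; with equal proportions but non-random union they do not come out as 1 : 2 : 1.

```python
from fractions import Fraction

def zygote_frequencies(p_male, p_female):
    """Frequencies of AA, Aa and aa zygotes when gametes unite at random.

    p_male, p_female: proportion of 'A' gametes among the male and the
    female gametes respectively.
    """
    p, q = Fraction(p_male), Fraction(p_female)
    return {
        "AA": p * q,
        "Aa": p * (1 - q) + (1 - p) * q,
        "aa": (1 - p) * (1 - q),
    }

# Equal proportions with random union: the familiar 1 : 2 : 1.
mendel = zygote_frequencies(Fraction(1, 2), Fraction(1, 2))

# Unequal proportions with random union: frequencies are still determined.
skewed = zygote_frequencies(Fraction(1, 3), Fraction(1, 3))

# Equal proportions WITHOUT random union: if like always fused with like,
# half the zygotes would be AA, half aa, and no hybrids would appear.
assortative = {"AA": Fraction(1, 2), "Aa": Fraction(0), "aa": Fraction(1, 2)}
```

The last case shows why equal numbers of the two kinds of gametes cannot alone yield the observed ratios; the second shows that frequencies remain derivable when the equal-numbers assumption is dropped, provided union is random.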

When  we  are  axiomatizing  we  are  primarily  interested  in  ordering  the 
statements  of  a  theory  by  means  of  the  relation  of  logical  consequence; 
but  where  theories  of  natural  science  are  concerned  we  are  also  interested 
in  another  relation  between  statements,  a  relation  which  I  will  call  the 
relation of epistemic priority. A theory in natural science is like an iceberg:
most of it is out of sight, and the relation of epistemic priority
holds  between  a  statement  A  and  a  statement  B  when  A  speaks  about 
those  parts  of  the  iceberg  which  are  out  of  water  and  B  about  those  parts 
which  are  out  of  sight;  or  A  speaks  about  parts  which  are  only  a  little 
below  the  surface  and  B  about  parts  which  are  deeper.  In  other  words :  A 
is  less  theoretical,  less  hypothetical,  assumes  less  than  B.  If  A  is  the 
statement 

Macbeth  is  getting  a  view  of  a  dagger 

and  B  is  the  statement 

Macbeth  is  seeing  a  dagger 

then  A  is  epistemically  prior  to  B.  Macbeth  was  in  no  doubt  about  A, 
but  he  was  in  serious  doubt  about  B  and  his  doubts  were  confirmed  when 
he  tried  to  touch  the  dagger  but  failed  to  get  a  feel  of  it.  Again,  if  A  is  the 
statement 

Houses have windows so that people inside can see things

and B is the statement

Houses  have  windows  in  order  to  let  the  light  in 

then  A  is  epistemically  prior  to  B. 

We  not  only  say  that  Columbus  discovered  America,  but  also  that  J.  J. 
Thomson  discovered  electrons.  In  doing  so  we  are  clearly  using  the  word 
'discovered'  in  two  distinct  senses.  What  J.  J.  Thomson  discovered  in  the 
first  sense  was  what  we  may  expect  to  observe  when  an  electrical  discharge 
is passed through a rarefied gas. He then introduced the word 'electron'
into the language of physics in order to formulate a hypothesis from
which would follow the generalizations of his discoveries concerning
rarefied gases. It will help to distinguish the two kinds of discoveries if we
call statements which are generalizations from accessible sets to inaccessible
sets inductive hypotheses, and statements which are introduced in
order to have such hypotheses among their logical consequences explanatory
hypotheses. Then we can say that to every explanatory hypothesis
$S_1$ there is at least one inductive hypothesis $S_2$ such that $S_2$ is a
consequence of $S_1$ (or of $S_1$ in conjunction with other hypotheses) and is
epistemically prior to it. Were this not so $S_1$ would not be testable. But,
as we shall see later, it is also possible to have an explanatory hypothesis
$S_3$ and an inductive hypothesis $S_2$, which is not a consequence of $S_3$
although it is epistemically prior to it, both of which are consequences of the
same explanatory hypothesis $S_1$. If what you want to say can be expressed
just  as  well  by  a  statement  A  as  by  a  statement  B  then,  if  A  is  epistemically 
prior  to  B,  it  will  (if  no  other  considerations  are  involved)  be  better  to 
use A. In what follows I shall try to formulate all the statements concerned
in the highest available epistemic priority. Statements concerning
parents  and  offspring  only  are  epistemically  prior  to  statements  which  also 
speak  about  gametes  and  zygotes;  and  statements  about  gametes  and 
zygotes  are  epistemically  prior  to  statements  which  speak  also  about  the 
parts  of  gametes  and  zygotes.  The  further  we  go  from  the  epistemically 
prior  inductive  hypotheses  the  more  we  are  taking  for  granted  and  the 
greater  the  possibility  of  error.  The  following  discussion  of  Mendel's  First 
Law will be in terms of parents, offspring, gametes, zygotes and environments.

The  foregoing  remarks  may  now  be  illustrated  by  a  brief  reference  to 
Mendel's  actual  experiments.  Suppose  X  and  Y  are  accessible  sets  of 
parents.  Let  us  denote  the  set  of  all  the  offspring  of  these  parents  which 
develop  in  environments  belonging  to  the  set  E  by 

$f_E(X, Y)$

If all members of X resemble one another in some respect (other than
merely all being members of X) and all members of Y resemble one
another in some other respect (also other than merely all being members of
Y), so that the respect in which members of X resemble one another is
distinct from that in which members of Y resemble one another, then
$f_E(X, Y)$ constitutes an accessible set of hybrids. We also need $f_E^2(X, Y)$,
which is defined as follows:

$f_E^2(X, Y) = f_E(f_E(X, Y), f_E(X, Y))$

Mendel  experimented  with  seven  pairs  of  mutually  exclusive  accessible 
sets  and  the  hybrids  obtained  by  crossing  them.  It  will  suffice  if  we 
consider one pair. Let 'A' denote the pea plants with which Mendel began
his experiments and which were tall in the sense of being about six feet
high; and let 'C' denote the peas which he used and which were dwarf in
the sense of being only about one foot high. Let us use 'T' to denote the
inaccessible  set  of  all  tall  pea  plants  and  'D'  to  denote  the  inaccessible  set 
of  all  dwarf  pea  plants.  Thus  we  have 

$A \subset T$ and $C \subset D$

Let us use 'B' to denote the set of all environments in which Mendel's
peas  developed.  Mendel  first  tested  his  As  and  Cs  to  discover  whether 
they  bred  true  and  found  that  they  did  because 

$f_B(A, A) \subset T$ and $f_B^2(A, A) \subset T$
$f_B(C, C) \subset D$ and $f_B^2(C, C) \subset D$

He next produced hybrids and reported that

$f_B(A, C) \subset T$


Finally, he took 100 of the tall members of $f_B^2(A, C)$ and self-fertilized
them. From 28 he obtained only tall plants and from 72 he obtained some
tall and some dwarf. This indicated that about one third of the tall plants
of $f_B^2(A, C)$ were pure-breeding talls like $f_B(A, A)$ and two thirds were like
the hybrid talls or $f_B(A, C)$.
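
The offspring notation can be read as an operation on accessible sets of breeding records. The following Python sketch is a hedged illustration: the record format, the functions `f` and `f2`, and the toy pedigree are all invented here, and the treatment of the two parents as interchangeable is an assumption rather than something the text states.

```python
def f(E, X, Y, records):
    """Offspring with one parent in X and the other in Y, raised in an
    environment belonging to E (parents taken in either order)."""
    return {
        child
        for (p1, p2, env, child) in records
        if env in E and ((p1 in X and p2 in Y) or (p1 in Y and p2 in X))
    }

def f2(E, X, Y, records):
    """Second generation: offspring of the hybrids f(E, X, Y) among themselves."""
    hybrids = f(E, X, Y, records)
    return f(E, hybrids, hybrids, records)

# Invented toy pedigree: (parent, parent, environment, child).
records = [
    ("a1", "c1", "plot", "h1"),
    ("a2", "c1", "plot", "h2"),
    ("h1", "h2", "plot", "g1"),
    ("h1", "h1", "plot", "g2"),   # self-fertilization: both parents identical
    ("h2", "h2", "plot", "g3"),
    ("a1", "a2", "plot", "t1"),
]
A, C, B = {"a1", "a2"}, {"c1"}, {"plot"}
tall = {"a1", "a2", "h1", "h2", "g1", "g2", "t1"}   # stands in for T

first_generation = f(B, A, C, records)
second_generation = f2(B, A, C, records)
```

In this toy pedigree `first_generation` is included in `tall`, matching the report that all hybrids were tall, whereas `second_generation` contains the dwarf plant `g3` and so is not.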

Closely  similar  results  were  obtained  in  the  other  six  experiments, 
although  the  respects  in  which  the  plants  differed  were  in  those  cases  not 
concerned  with  height  but  with  colour  or  form  of  seed  or  pod  or  the 
position  of  the  flowers  on  the  stem.  In  each  case  the  hybrids  all  resembled 
only  one  of  the  parental  types,  which  Mendel  accordingly  called  the 
dominant  one.  The  parental  type  which  was  not  represented  in  the  first 
hybrid  generation,  but  which  reappeared  in  the  second,  he  called  the 
recessive  one.  Mendel  took  the  average  of  the  seven  experiments  and  sums 
up  as  follows: 

If  now  the  results  of  the  whole  of  the  experiments  be  brought  together,  there  is 
found, as between the number of forms with the dominant and recessive characters,
an average ratio of 2.98 to 1, or 3 to 1.

So  long  as  we  assert  that  the  average  ratio  is  2.98  to  1  we  are  dealing  with 
accessible sets and have no law or explanatory hypothesis. But what does
Mendel's addition 'or 3 to 1' mean? Presumably these few words express
the leap from an observed proportion in an accessible set to a hypothetical
proportion in an inaccessible set. This represents Mendel's discovery as
opposed  to  his  observations.  At  the  same  time  there  is  no  proposal  to 
extend  this  beyond  garden  peas.  This  extension  was  done  by  Mendel's 
successors who, on the basis of many observations, extended his generalization
regarding the proportions of kinds of offspring of hybrids over
a  wide  range  of  inaccessible  sets  not  only  of  plants  but  also  of  animals.  In 
addition  to  this  Mendel  also  left  us  his  explanatory  hypothesis,  the 
hypothesis  namely  that  the  hybrids  produce  gametes  of  two  kinds  —  one 
resembling  the  gametes  produced  by  the  pure  dominant  parents,  and  the 
other  resembling  those  produced  by  the  recessive  parents.  He  also  assumed 
that  these  two  kinds  of  gametes  were  produced  in  equal  numbers.  We  have 
now  to  consider  what  is  the  minimum  theoretical  basis  for  deriving  this 
hypothesis  as  a  theorem  in  an  axiom  system. 

A  GENETICAL  AXIOM  SYSTEM 

(In what follows the axiom system is given in the symbolic notation of set-theory,
sentential calculus and the necessary biological functors (the last in
bold-face type). Accompanying this is a running commentary in words intended
to assist the reading of the system; but it must be understood that this
commentary forms no part of the system itself.)

The following primitives suffice for the construction of a genetical
axiom system expressed on the level of epistemic priority here adopted;
for cyto-genetics (and even perhaps for extending the present system)
additional primitives are necessary.

(i) '$u\,\mathbf{F}\,x$' for 'u is a gamete which fuses with another gamete to form the
zygote (fertilized egg) x.'
(ii) '$\mathbf{dlz}\,xyz$' for 'x is a zygote which develops in the environment y into
the life z.'
(iii) '$u\,\mathbf{gam}\,z$' for 'u is a gamete produced by the life z.'
(iv) '$\male$' is the class of all male gametes.
(v) '$\female$' is the class of all female gametes.
(vi) 'phen' is an abbreviation for 'phenotype'.

The  following  postulates  are  needed  for  the  derivation  of  the  theorems 
which  are  to  follow: 

POSTULATE 1     $(u)(v)(x) : u\,\mathbf{F}\,x \,.\, v\,\mathbf{F}\,x \,.\, u \neq v \,.\supset.\, {\sim}(\exists w) \,.\, w\,\mathbf{F}\,x \,.\, w \neq u \,.\, w \neq v$

This asserts that not more than two gametes fuse to form each zygote.

POSTULATE 2     $(x)(w) : (\exists u) \,.\, u\,\mathbf{F}\,x \,.\, u\,\mathbf{F}\,w \,.\supset.\, x = w$

This asserts that if a gamete unites with another to form a zygote then
there is no other zygote for which this is true.

POSTULATE 3     $(u)(v)(x) :. u\,\mathbf{F}\,x \,.\, v\,\mathbf{F}\,x \,.\, u \neq v \,.\supset: u \in \male \,.\, v \in \female \,.\lor.\, u \in \female \,.\, v \in \male$

This asserts that of the two gametes which unite to form any zygote one is
a male gamete and the other a female gamete.

POSTULATE 4     $\male \cap \female = \Lambda$

This  asserts  that  no  gamete  is  both  male  and  female. 

POSTULATE 5     $(x)(y)(z)(u)(v) : \mathbf{dlz}\,xyz \,.\, \mathbf{dlz}\,uvz \,.\supset.\, x = u \,.\, y = v$

This  asserts  that  every  life  develops  in  one  and  only  one  environment  from 
one  and  only  one  zygote. 

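POSTULATE 6     $(x)(y)(z)(x')(y')(z') : \mathbf{dlz}\,xyz \,.\, \mathbf{dlz}\,x'y'z' \,.\, (\exists u) \,.\, u\,\mathbf{gam}\,z \,.\, u\,\mathbf{gam}\,z' \,.\supset.\, x = x'$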


This  asserts  that  if  there  is  a  gamete  produced  by  a  life  z  and  the  same 
gamete  is  produced  by  a  life  z'  then  the  zygote  from  which  z  develops  is 
identical  with  the  zygote  from  which  z1  develops.  This  may  seem  strange 
until  it  is  explained  that  by  'a  life'  is  here  meant  something  with  a  be- 
ginning and  an  end  in  time  and  a  fixed  time  extent.  The  expression  is  thus 
being  used  in  a  way  somewhat  similar  to  the  way  in  which  it  is  used  in 
connexion  with  lite  insurance.  Suppose  a  zygote  is  formed  at  midnight  on 
a  certain  day  ;  suppose  it  develops  for  say  ten  days  and  on  that  day  death 
occurs.  Then  the  whole  time-extended  object  of  ten  days  duration  'from 
fertilization  to  funeral'  is  a  life  which  is  complete  in  time.  But  suppose  we 
are  only  concerned  with  what  happens  during  the  first  ten  hours',  then 
that  also  is  a  life,  in  the  sense  in  which  the  word  is  here  used,  and  one 
which  is  a  proper  part  of  the  former  one.  Now  if  a  gamete  is  said  to  be 
produced  by  the  shorter  life  it  is  also  produced  by  the  longer  one  of  which 
that  shorter  one  is  a  part  ;  we  cannot  identify  the  two  lives  but  we  can  say 
that  they  both  develop  from  the  same  zygote.  As  here  understood  the 
time-length  of  a  life  fixes  its  environment  ;  because  the  environment  of  a 
life  is  the  sphere  and  its  contents  which  has  the  zygote  from  which  de- 
velopment begins  as  its  centre  and  a  radius  which  is  equal  in  light-years  to 
the  length  of  the  life  in  years.  But  no  time-metric  is  needed  for  the  present 
system  and  many  complications  are  therefore  avoided. 

All  the  primitive  notions  of  this  system  are  either  relations  between 
individuals  or  are  classes  of  individuals.  But  the  statements  of  genetics 
with  which  we  are  concerned  in  what  follows  do  not  speak  of  individual 
lives,  individual  environments,  individual  zygotes  or  individual  gametes 
but  of  classes  of  individuals  and  of  relations  between  such  classes.  But 
the  classes  we  require  are  definable  by  means  of  the  primitives. 

DEFINITION 1  $x \in U(\alpha, \beta) .\equiv: (\exists u)(\exists v) . u \in \alpha . v \in \beta . u \neq v . uFx . vFx$

We thus use '$U(\alpha, \beta)$' to denote the class of all zygotes which are formed by
the union of a gamete belonging to the class $\alpha$ with one belonging to the
class $\beta$.

DEFINITION 2  $z \in L_E(Z) .\equiv: (\exists x)(\exists y) . \mathrm{dlz}\,xyz . x \in Z . y \in E$

'$L_E(Z)$' is used to denote the class of all lives which develop from a zygote
belonging to the class Z in an environment belonging to the class E.

DEFINITION 3  $u \in G_E(X) .\equiv: (\exists x)(\exists y)(\exists z) . \mathrm{dlz}\,xyz . y \in E . z \in X . u\,\mathrm{gam}\,z$

'$G_E(X)$' thus denotes the class of all gametes which are produced by lives
belonging to the class X when they develop in environments belonging to
the class E.

DEFINITION 4  $z \in \mathrm{Fil}_{K,M,E}(X, Y) .\equiv: (\exists x)(\exists y)(\exists u)(\exists v) . u \in G_K(X) . v \in G_M(Y) . uFx . vFx . u \neq v . y \in E . \mathrm{dlz}\,xyz$

The letters 'Fil' are taken from the word 'filial'. The above definition
provides a notation for the class of all offspring which develop in
environments belonging to the class E and having one parent belonging to the
class X and developing in an environment belonging to the class K and
the other parent belonging to the class Y and developing in an
environment belonging to the class M. For Mendelian contexts only one
environmental class need be considered; provision for this simplification is
made below.

The  above  four  definitions  suffice  for  most  purposes.  But  it  frequently 
happens  that  we  need  to  substitute  one  of  the  above  expressions  for  the 
variables  of  another  and  in  that  way  very  complicated  expressions  may 
arise.  In  order  to  avoid  this  the  following  abbreviations  are  introduced  by 
definition  : 

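DEFINITION 5  $D(\alpha, \beta, E) = L_E(U(\alpha, \beta))$

DEFINITION 6  $G(\alpha, \beta, E) = G_E(L_E(U(\alpha, \beta)))$

DEFINITION 7  $F_{K,M,E}(\alpha, \beta; \gamma, \delta) = \mathrm{Fil}_{K,M,E}(D(\alpha, \beta, K), D(\gamma, \delta, M))$

DEFINITION 8  $F_E(\alpha, \beta; \gamma, \delta) = F_{E,E,E}(\alpha, \beta; \gamma, \delta)$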
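
As an illustration only, Definitions 1, 2, 3, 5 and 6 can be tried out on a small finite model. In the sketch below the three primitive relations are encoded as Python sets of tuples; the encoding, the names `F`, `DLZ`, `GAM`, and every gamete, zygote and life listed are hypothetical and are not part of the system above.

```python
# Toy finite model; all data here is invented for illustration.
# F:   (gamete, zygote) pairs, read "u F x" (gamete u enters into zygote x)
# DLZ: (zygote, environment, life) triples, read "dlz xyz"
# GAM: (gamete, life) pairs, read "u gam z" (gamete u is produced by life z)
F = {("a1", "z1"), ("b1", "z1"), ("a2", "z2"), ("a3", "z2")}
DLZ = {("z1", "e1", "life1"), ("z2", "e1", "life2")}
GAM = {("a4", "life1"), ("b2", "life1"), ("a5", "life2")}


def U(alpha, beta):
    """Definition 1: zygotes formed by a gamete of alpha and a distinct gamete of beta."""
    return {x for (u, x) in F for (v, y) in F
            if x == y and u != v and u in alpha and v in beta}


def L(E, Z):
    """Definition 2: lives developing from a zygote in Z in an environment in E."""
    return {z for (x, y, z) in DLZ if x in Z and y in E}


def G_E(E, X):
    """Definition 3: gametes produced by lives in X which develop in environments in E."""
    return {u for (u, z) in GAM
            if z in X and any(life == z and env in E for (_, env, life) in DLZ)}


def D(alpha, beta, E):
    """Definition 5: lives developing in E from zygotes of U(alpha, beta)."""
    return L(E, U(alpha, beta))


def G(alpha, beta, E):
    """Definition 6: gametes produced by the lives of D(alpha, beta, E)."""
    return G_E(E, D(alpha, beta, E))


alpha = {"a1", "a2", "a3", "a4", "a5"}
beta = {"b1", "b2"}
E = {"e1"}

print(U(alpha, beta))       # zygotes with one alpha gamete and one beta gamete
print(D(alpha, alpha, E))   # lives from alpha-alpha zygotes
print(G(alpha, beta, E))    # gametes of the alpha-beta lives
```

In this toy model the alpha-beta life produces gametes of both classes, which is the situation the later hypotheses about hybrids are concerned with.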

All the foregoing notions are general and familiar ones. We must now
turn  to  some  of  a  more  special  and  novel  kind.  If  our  present  inquiry  were 
not  confined  to  the  single  topic  of  Mendel's  First  Law  we  should  at  this 
stage  introduce  the  notion  of  a  genetical  system,  and  we  should  maintain 
that  genetical  systems  as  then  intended  constitute  the  proper  objects  of 
genetical  investigations.  But  for  the  present  purpose  it  suffices  if  we  speak 
of  a  specially  simple  kind  of  genetical  system  which  we  shall  call  genetical 
units. A genetical unit is a set of three classes: one is a phenotype, another
is a class of gametes and the third is a class of environments, provided
certain conditions are satisfied. Suppose $\{P, \alpha, E\}$ is a candidate for the
title of genetical unit; then it must be developmentally closed, that is to say
$D(\alpha, \alpha, E)$ must be a non-empty class and it must be included in the
phenotype P; next it must be genetically closed, that is to say $G(\alpha, \alpha, E)$
must be non-empty and must be included in $\alpha$. Thus neither the process of
development  nor  that  of  gamete-formation  takes  us  out  of  the  system; 
it  thus  'breeds  true'.  The  official  definition  is: 

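DEFINITION 9  $S \in \mathrm{genunit} .\equiv: (\exists P)(\exists \alpha)(\exists E) . P \in \mathrm{phen} . S = \{P, \alpha, E\} . D(\alpha, \alpha, E) \neq \Lambda . D(\alpha, \alpha, E) \subset P . G(\alpha, \alpha, E) \neq \Lambda . G(\alpha, \alpha, E) \subset \alpha$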


The  genetical  systems  with  which  Mendel  worked  were  genetical  units, 
sums  of  two  genetical  units  and  what  may  be  called  set-by-set  products 
of such sums. Thus if $\{P, \alpha, E\}$ and $\{Q, \beta, E\}$ are genetical units with the
phenotype P dominant to the phenotype Q we shall have $D(\alpha, \beta, E) \neq \Lambda$
and $D(\alpha, \beta, E) \subset P$, so that $\{P, Q, \alpha, \beta, E\}$, the sum of the two units, is
developmentally closed; if we also have $G(\alpha, \beta, E) \neq \Lambda$ and
$G(\alpha, \beta, E) \subset \alpha \cup \beta$, then the sum is also genetically closed. As we shall see shortly these
assumptions do not suffice to enable us to infer that the sum will behave
according to the Mendelian generalizations. If $\{R, \gamma, E\}$ and $\{S, \delta, E\}$ are
two more genetical units so that $\{R, S, \gamma, \delta, E\}$ is their sum, then the
set-by-set  product  of  this  sum  and  the  former  one  will  be 

$\{P \cap R, Q \cap R, P \cap S, Q \cap S, \alpha \cap \gamma, \beta \cap \gamma, \alpha \cap \delta, \beta \cap \delta, E\}$

and if it is developmentally and genetically closed this will constitute yet
another  type  of  genetical  system  which  was  studied  by  Mendel  and  with 
which  his  Second  Law  was  concerned. 

Before  we  can  proceed  with  the  biological  part  of  our  system  we  must 
now  say  something  about  the  set-theoretical  framework  within  which  it  is 
being  formulated  and  on  the  basis  of  which  proofs  of  theorems  are  carried 
out.  We  begin  with  two  important  definitions,  one  of  which  has  already 
been  mentioned.  (The  definitions  and  theorems  of  this  part  of  the  system 
will  have  Roman  numerals  assigned  to  them  in  order  to  distinguish  them 
from  biological  definitions  and  theorems). 

DEFINITION I  $X \in pY .\equiv. \dfrac{N(X \cap Y)}{N(X)} = p . 0 \le p \le 1$

pY  is  thus  the  set  of  all  classes  which  have  a  proportion  p  of  their  members 
belonging  to  Y.  N(X)  is  the  cardinal  number  of  the  class  X. 

DEFINITION II  $Z \in [X, Y] .\equiv: (\exists u)(\exists v) . u \in X . v \in Y . u \neq v . Z = \{u, v\}$

[X,  Y]  is  the  pair-set  of  the  classes  X  and  Y,  that  is  to  say  it  is  the  set  of 
all  pairs  (unordered)  having  one  member  belonging  to  the  class  X  and  the 
other  to  the  class  Y. 
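
Definitions I and II can be read directly as operations on finite sets. The following is a minimal sketch, assuming finite classes represented as Python sets; the function names `in_proportion` and `pair_set` are illustrative and do not come from the text. Exact fractions are used so that proportions are compared without rounding.

```python
from fractions import Fraction
from itertools import product


def in_proportion(X, p, Y):
    """Definition I: X belongs to pY when exactly the fraction p of X's members lie in Y."""
    X, Y = set(X), set(Y)
    if not X:
        # N(X) = 0 leaves the ratio undefined, so an empty class is in no pY here.
        return False
    return Fraction(len(X & Y), len(X)) == Fraction(p)


def pair_set(X, Y):
    """Definition II: all unordered pairs {u, v} with u in X, v in Y and u != v."""
    return {frozenset((u, v)) for u, v in product(X, Y) if u != v}


X = {1, 2, 3, 4}
Y = {3, 4, 5}
print(in_proportion(X, Fraction(1, 2), Y))            # is half of X inside Y?
print(sorted(sorted(pair) for pair in pair_set(X, Y)))  # the members of [X, Y]
```

Treating the empty class as belonging to no pY is a choice made for this sketch; it matches the requirement that the ratio in Definition I be defined.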

No  attempt  is  made  here  to  present  the  set-theoretical  background 
axiomatically.  We  simply  list,  for  reference  purposes,  the  following 
theorems  which  can  be  proved  within  (finite)  set  theory  and  arithmetic. 


THEOREM I  $N(X) = 0 .\equiv. X = \Lambda$

THEOREM II  $X \subset Y .\supset. N(X) \le N(Y)$

THEOREM III  $N(X \cup Y) = N(X \cap \bar{Y}) + N(X \cap Y) + N(\bar{X} \cap Y)$

THEOREM IV  $X \cap Y = \Lambda .\supset. N(X \cup Y) = N(X) + N(Y)$

THEOREM V  $N([X, Y]) = N(X \cap \bar{Y}) \cdot N(\bar{X} \cap Y) + N(X \cap Y) \cdot [N(X \cap \bar{Y}) + N(\bar{X} \cap Y) + \tfrac{1}{2}(N(X \cap Y) - 1)]$

THEOREM VI  $X \cap Y = \Lambda .\supset. N([X, Y]) = N(X) \cdot N(Y)$

THEOREM VII  $X \neq \Lambda . X \subset Y .\equiv. X \in 1Y$
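
Counting statements of this kind lend themselves to a brute-force check over small finite sets. The sketch below is not a proof; it exhaustively tests, for all pairs of subsets of a five-element universe, a closed-form count of the pair-set in terms of the three regions of X ∪ Y, together with the disjoint case of Theorem VI and the decomposition of Theorem III. The helper names are illustrative assumptions.

```python
from itertools import combinations, product


def pair_set(X, Y):
    """All unordered pairs {u, v} with u in X, v in Y and u != v (Definition II)."""
    return {frozenset((u, v)) for u, v in product(X, Y) if u != v}


def pair_count(X, Y):
    """Closed-form size of the pair-set from the sizes of X-only, shared, and Y-only regions."""
    a = len(X - Y)   # members of X only
    b = len(X & Y)   # members of both X and Y
    c = len(Y - X)   # members of Y only
    return a * c + b * (a + c) + b * (b - 1) // 2


universe = range(5)
subsets = [set(s) for r in range(6) for s in combinations(universe, r)]

for X, Y in product(subsets, repeat=2):
    # Pair-set size agrees with the closed form.
    assert len(pair_set(X, Y)) == pair_count(X, Y)
    # Disjoint classes: the pair-set has N(X) * N(Y) members.
    if not (X & Y):
        assert len(pair_set(X, Y)) == len(X) * len(Y)
    # The union splits into three mutually exclusive regions.
    assert len(X | Y) == len(X - Y) + len(X & Y) + len(Y - X)

print("checked", len(subsets) ** 2, "pairs of subsets")
```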

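THEOREM VIII  $X \in pY \cap qY .\supset. p = q$

THEOREM IX  $X \neq \Lambda .\supset. (\exists p) . \dots$

THEOREM X  $X \neq \Lambda . X \subset Y \cup Z . Y \cap Z = \Lambda .\equiv. X \in pY \cap (1 - p)Z$

THEOREM XI  $X \cap Y = \Lambda .\supset. (pX \cap qY) \subset (p + q)(X \cup Y)$

THEOREM XII  $Y \subset P . Z \subset Q . P \cap Q = \Lambda .\supset. (pY \cap (1 - p)Z) \subset (pP \cap (1 - p)Q)$

THEOREM XIII  $Y \subset A . Z \subset B . W \subset C . A \cap B = B \cap C = C \cap A = \Lambda .\supset. (pY \cap qZ \cap (1 - p - q)W) \subset (pA \cap qB \cap (1 - p - q)C)$

THEOREM XIV  $A \subset P . B \subset P . C \subset Q . A \cap B = B \cap C = C \cap A = P \cap Q = \Lambda .\supset. (pA \cap qB \cap (1 - p - q)C) \subset (p + q)P \cap (1 - p - q)Q$

THEOREM XV  $X \cap Y = \alpha \cap \beta = \Lambda . X \in p\alpha \cap (1 - p)\beta . \dots$

THEOREM XVI  $X \cap Y = \alpha \cap \beta = \Lambda .\supset: X \in \tfrac{1}{2}\alpha \cap \tfrac{1}{2}\beta . Y \in \tfrac{1}{2}\alpha \cap \tfrac{1}{2}\beta .\equiv. \dots$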


We  can  now  return  to  the  biological  part  of  our  system.  In  genetical 
statements  the  notion  of  randomness  frequently  occurs.  It  will  be  re- 
quired in  two  places  in  the  present  context.  In  both  of  these  it  means 
persistence  of  certain  relative  frequencies  during  a  process.  It  means  the 
absence  of  selection  or  favouritism. 

We shall say that a set S which is the sum of two genetical units is
random with respect to U or that the union of the gametes is random in S if
and only if X and Y being any classes of gametes of the form $G(\alpha, \beta, E)$,
$\alpha$ and $\beta$ being any gamete classes and E the environment class of S,
whenever we have

$[X, Y] \in p[\gamma, \delta]$

we also have

$U(X, Y) \in pU(\gamma, \delta)$

$\gamma$ and $\delta$ also being gamete-classes of S. The following definition covers
cases  where  S  has  an  additional  phenotype  because  there  is  no  dominance. 
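
DEFINITION 10  $S \in \mathrm{rand}\,U .\equiv:. (\alpha)(\beta)(\gamma)(\delta)(\zeta)(\theta)(E)(p) : (\exists S_1)(\exists S_2) : S_1, S_2 \in \mathrm{genunit} . S = S_1 \cup S_2 .\lor. (\exists R) . R \in \mathrm{phen} . S = S_1 \cup S_2 \cup \{R\} . \alpha, \beta, \gamma, \delta, \zeta, \theta, E \in S . [G(\alpha, \beta, E), G(\gamma, \delta, E)] \in p[\zeta, \theta] .\supset. U(G(\alpha, \beta, E), G(\gamma, \delta, E)) \in pU(\zeta, \theta)$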

Analogously  we  can  say  that  such  a  set  S  is  random  with  respect  to  D(E) 
or  that  development  in  members  of  the  environment  class  E  of  S  is  random 
if  and  only  if,  whenever  we  have 

$U(G(\alpha, \beta, E), G(\gamma, \delta, E)) \in pU(\zeta, \theta)$

we also have

$D(G(\alpha, \beta, E), G(\gamma, \delta, E), E) \in pD(\zeta, \theta, E)$

the Greek letters all being variables whose values are the gamete classes
of S, and 'E' being a variable whose single value (in Mendelian cases) is the
environmental class of S.

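DEFINITION 11  $S \in \mathrm{rand}\,D(E) .\equiv:. (\alpha)(\beta)(\gamma)(\delta)(\zeta)(\theta)(E)(p) : (\exists S_1)(\exists S_2) : S_1, S_2 \in \mathrm{genunit} . S = S_1 \cup S_2 .\lor. (\exists R) . R \in \mathrm{phen} . S = S_1 \cup S_2 \cup \{R\} . \alpha, \beta, \gamma, \delta, \zeta, \theta, E \in S . U(G(\alpha, \beta, E), G(\gamma, \delta, E)) \in pU(\zeta, \theta) :\supset. D(G(\alpha, \beta, E), G(\gamma, \delta, E), E) \in pD(\zeta, \theta, E)$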

We  now  give  a  list  of  biological  theorems  which  are  provable  from  the 
postulates  and  definitions  and  are  used  in  the  proofs  of  the  major  theorems 
to  follow.  On  the  right  hand  side  of  each  theorem  are  indicated  the 
postulates  (P),  definitions  (D)  or  theorems  (T)  required  for  its  proof. 

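THEOREM 1  $U(\alpha, \beta) = U(\beta, \alpha)$  [D1]

THEOREM 2  $U(X, X) = U(X \cap ♂, X \cap ♀)$  [D1, P3]

THEOREM 3  $\alpha \cap \beta = \Lambda .\supset. U(\alpha, \alpha) \cap U(\beta, \beta) = \Lambda$  [P1, D1]

THEOREM 4  $\alpha \cap \beta = \Lambda .\supset. U(\alpha, \alpha) \cap U(\alpha, \beta) = \Lambda$  [P1, D1]

THEOREM 5  $E \cap K = \Lambda .\lor. Z \cap W = \Lambda :\supset. L_E(Z) \cap L_K(W) = \Lambda$  [D2, P2]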

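THEOREM 6  $U(\alpha, \alpha) \cap U(\beta, \beta) = \Lambda .\supset. D(\alpha, \alpha, E) \cap D(\beta, \beta, E) = \Lambda$  [D5, D2, P2]

THEOREM 7  $U(\alpha, \alpha) \cap U(\alpha, \beta) = \Lambda .\supset. D(\alpha, \alpha, E) \cap D(\alpha, \beta, E) = \Lambda$  [D5, T5]

THEOREM 8  $\alpha \cap \beta = \Lambda .\supset. D(\alpha, \alpha, E) \cap D(\beta, \beta, E) = \Lambda$  [T3, T6]

THEOREM 9  $\alpha \cap \beta = \Lambda .\supset. D(\alpha, \alpha, E) \cap D(\alpha, \beta, E) = \Lambda$  [T4, T7]

THEOREM 10  $D(X \cap ♂, X \cap ♀, E) = D(X, X, E)$  [D5, T2]

THEOREM 11  $\alpha \cap \beta = \Lambda .\supset. G(\alpha, \beta, E) \cap G(\beta, \beta, E) = \Lambda$  [D6, D3, D2, P5, P6, T4]

THEOREM 12  $G(\alpha, \alpha, E) \subset \alpha .\supset. D(G(\alpha, \alpha, E), G(\alpha, \alpha, E), E) \subset D(\alpha, \alpha, E)$  [D1, D2, D5]

THEOREM 13  $F_E(\alpha, \beta; \gamma, \delta) = D(G(\alpha, \beta, E), G(\gamma, \delta, E), E)$  [D8, D7, D4, D5, D6, D1, D2]

THEOREM 14  $\{P, \alpha, E\} \in \mathrm{genunit} .\supset. F_E(\alpha, \alpha; \alpha, \alpha) \subset P$  [T13, D9, T12]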

By a mating description is meant a statement of the form $X \subset Y$ or
$X \in pY$ where 'X' is an expression denoting a set of offspring, e.g.
'$F_E(\alpha, \beta; \alpha, \beta)$' and 'Y' denotes a phenotype. We turn now to the task of
discovering  what  must  be  assumed  in  order  to  derive  the  characteristic 
Mendelian  mating  descriptions,  beginning  with  that  which  asserts  the 
relative  frequencies  of  dominants  and  recessives  in  the  offspring  of 
hybrids  when  these  are  mated  with  one  another.  For  reference  purposes 
it  will  be  convenient  if  we  use  abbreviations  for  groups  of  the  various 
separate  hypotheses  which  enter  into  the  antecedents  of  the  following 
theorems.  Let  us  therefore  put : 

H 1.  for:  $\{P, \alpha, E\}, \{Q, \beta, E\} \in \mathrm{genunit} . P \cap Q = \alpha \cap \beta = \Lambda$

($\{P, \alpha, E\}$ and $\{Q, \beta, E\}$ are genetical units and P and Q, and $\alpha$ and $\beta$, are
mutually exclusive)

H 2.  for:  $D(\alpha, \beta, E) \subset P$

(the hybrids are included in the phenotype P)

H 3.  for:  $(\exists R) . R \in \mathrm{phen} . D(\alpha, \beta, E) \subset R$

(this covers cases where there is no dominance but the hybrids are
included  in  a  third  phenotype  R) 

H 4a.  for:  $G(\alpha, \beta, E) \cap ♂ \in \tfrac{1}{2}\alpha \cap \tfrac{1}{2}\beta . G(\alpha, \beta, E) \cap ♀ \in \tfrac{1}{2}\alpha \cap \tfrac{1}{2}\beta$

(This is one form of Mendel's own hypothesis. He assumed that in
the gametes of the hybrids the two kinds occurred in equal numbers
both in the case of male and in the case of female gametes. Theorem
XVI shows that the above form is equivalent to this.)

H 4b.  for:  $G(\alpha, \beta, E) \cap ♂ \neq \Lambda . G(\alpha, \beta, E) \cap ♂ \subset \alpha \cup \beta . G(\alpha, \beta, E) \cap ♀ \neq \Lambda . G(\alpha, \beta, E) \cap ♀ \subset \alpha \cup \beta$

(This is a weaker form of H 4a because it only assumes
non-emptiness and inclusion.)

H 4c.  for:  $G(\alpha, \beta, E) \neq \Lambda . G(\alpha, \beta, E) \subset \alpha \cup \beta$

(This is weaker still because it does not make separate statements
regarding the gametes of different sex.)

H 5.  for:  $S = \{P, \alpha, E\} \cup \{Q, \beta, E\} . S \in \mathrm{rand}\,U \cap \mathrm{rand}\,D(E)$

(This is the hypothesis that the system in question is the sum of
two genetical units (H 1) and is random both with respect to the
union of the gametes and also with respect to the development of
the resulting zygotes in the environments belonging to E.)

H 5a.  for:  $S = \{P, \alpha, E\} \cup \{Q, \beta, E\} \cup \{R\}$ and $S \in \mathrm{rand}\,U \cap \mathrm{rand}\,D(E)$

(This is to cover the cases when there is no dominance.)

The following theorems are asserted for all values of the variables P,
Q, R, $\alpha$, $\beta$, E.

THEOREM  15  states  that  if  we  have  H  1,  H  2,  H  4a  and  H  5  we  also 
have  three  quarters  of  the  offspring  of  the  hybrids  belonging  to  the  domi- 
nant and  the  remaining  quarter  to  the  recessive  phenotype. 

THEOREM 15  $H\,1 . H\,2 . H\,4a . H\,5 .\supset. F_E(\alpha, \beta; \alpha, \beta) \in \tfrac{3}{4}P \cap \tfrac{1}{4}Q$

In  order  to  make  all  the  steps  explicit  we  give  the  following  derivation  of 
this  theorem : 

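(1) Using 'X' as an abbreviation of '$G(\alpha, \beta, E)$' we have, by H 1 and P 4

$(X \cap ♂) \cap (X \cap ♀) = \alpha \cap \beta = \Lambda$

(2) By (1), H 4a and T XV we can write:

$[X \cap ♂, X \cap ♀] \in \tfrac{1}{2} \cdot \tfrac{1}{2}[\alpha, \alpha] \cap 2(\tfrac{1}{2} \cdot \tfrac{1}{2})[\alpha, \beta] \cap \tfrac{1}{2} \cdot \tfrac{1}{2}[\beta, \beta]$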

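(3) From (2), H 5 and D 10 we are now able to obtain:

$U(X \cap ♂, X \cap ♀) \in \tfrac{1}{4}U(\alpha, \alpha) \cap \tfrac{1}{2}U(\alpha, \beta) \cap \tfrac{1}{4}U(\beta, \beta)$

(4) We next obtain from (3), H 5 and D 11:

$D(X \cap ♂, X \cap ♀, E) \in \tfrac{1}{4}D(\alpha, \alpha, E) \cap \tfrac{1}{2}D(\alpha, \beta, E) \cap \tfrac{1}{4}D(\beta, \beta, E)$

(5) From H 1, D 9 and H 2 we have:

$D(\alpha, \alpha, E) \subset P$ and $D(\alpha, \beta, E) \subset P$ and $D(\beta, \beta, E) \subset Q$

(6) From H 1 we have $\alpha \cap \beta = \Lambda$ and so with the help of T 8 and T 9
we get:

$D(\alpha, \alpha, E) \cap D(\alpha, \beta, E) = D(\alpha, \beta, E) \cap D(\beta, \beta, E) = D(\beta, \beta, E) \cap D(\alpha, \alpha, E) = \Lambda$

(7) From (5) and (6) with the help of T XIV we now get:

$\tfrac{1}{4}D(\alpha, \alpha, E) \cap \tfrac{1}{2}D(\alpha, \beta, E) \cap \tfrac{1}{4}D(\beta, \beta, E) \subset \tfrac{3}{4}P \cap \tfrac{1}{4}Q$

(8) By T 10 we have:

$D(X \cap ♂, X \cap ♀, E) = D(X, X, E)$

(9) From (4), (7) and (8) we obtain:

$D(X, X, E) \in \tfrac{3}{4}P \cap \tfrac{1}{4}Q$

(10) Putting '$G(\alpha, \beta, E)$' for 'X' in (9) in accordance with (1):

$D(G(\alpha, \beta, E), G(\alpha, \beta, E), E) \in \tfrac{3}{4}P \cap \tfrac{1}{4}Q$

(11) By substitution of '$\alpha$' for '$\gamma$' and '$\beta$' for '$\delta$' in T 13 we get:

$F_E(\alpha, \beta; \alpha, \beta) = D(G(\alpha, \beta, E), G(\alpha, \beta, E), E)$

(12) Finally from (10) and (11) we obtain the required result:

$F_E(\alpha, \beta; \alpha, \beta) \in \tfrac{3}{4}P \cap \tfrac{1}{4}Q$
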
Before  commenting  on  this  we  shall  give  the  remaining  theorems. 
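
The arithmetic of the derivation of Theorem 15 can be illustrated by exact enumeration on a tiny hypothetical population. In this sketch the gamete pools, the rule that every male-female pair forms one zygote (standing in for random union) and the rule that every zygote develops (standing in for random development) are all simplifying assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical gamete pools of the hybrids: half alpha and half beta among
# the male gametes and likewise among the female gametes (as in H 4a).
male = [("alpha", i) for i in range(2)] + [("beta", i) for i in range(2)]
female = [("alpha", i) for i in range(2)] + [("beta", i) for i in range(2)]

# Random union is modelled by letting every male-female pair form exactly
# one zygote, so the proportions among pairs carry over to zygotes.
zygotes = [(m[0], f[0]) for m, f in product(male, female)]


def phenotype(zygote):
    # Dominance (H 2): alpha-alpha and alpha-beta lives fall in P;
    # only beta-beta lives fall in the recessive phenotype Q.
    return "Q" if zygote == ("beta", "beta") else "P"


counts = {"P": 0, "Q": 0}
for z in zygotes:          # every zygote develops (random development)
    counts[phenotype(z)] += 1

total = len(zygotes)
proportions = {k: Fraction(v, total) for k, v in counts.items()}
print(proportions)
```

With these assumptions three quarters of the sixteen offspring fall in P and one quarter in Q, which is the proportion asserted in the consequent of Theorem 15.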

THEOREM 16 is concerned with the offspring of hybrids when mated
with the recessive parents, a mating type commonly called a back-cross.
It is stated here in a somewhat unusual form and with the weakest
possible antecedent. It states that if the hypotheses H 1, H 2, H 4c and
H 5 are adopted then we should expect the proportions of the two
phenotypes in the offspring to be identical with the proportions of the two
kinds of gametes in the gametes produced by the hybrids. If, therefore,
we assume, on the basis of samples, that $F_E(\beta, \beta; \alpha, \beta) \in \tfrac{1}{2}P \cap \tfrac{1}{2}Q$ we must
also assume that $G(\alpha, \beta, E) \in \tfrac{1}{2}\alpha \cap \tfrac{1}{2}\beta$. The first of these hypotheses is
epistemically prior to the second and yet they both occur together in the
consequent of this theorem.

THEOREM 16  $H\,1 . H\,2 . H\,4c . H\,5 .\supset. (\exists p) . F_E(\beta, \beta; \alpha, \beta) \in pP \cap (1 - p)Q . G(\alpha, \beta, E) \in p\alpha \cap (1 - p)\beta$

The derivation of Theorem 16 requires: T 13, T 11, T X, T XV, T VII,
D 9, D 10, D 11 and T XII.
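
The content of Theorem 16 can be illustrated with a short calculation. The sketch below assumes, as simplifications, that the recessive parent contributes only beta gametes and that union and development are random in the sense of carrying proportions over unchanged; the function name `backcross` is hypothetical.

```python
from fractions import Fraction


def backcross(p):
    """Offspring phenotype proportions when hybrids whose gametes are p alpha
    and (1 - p) beta are mated with recessives whose gametes are all beta."""
    p = Fraction(p)
    alpha_beta = p * 1        # alpha from the hybrid, beta from the recessive: phenotype P
    beta_beta = (1 - p) * 1   # beta from both parents: phenotype Q
    return {"P": alpha_beta, "Q": beta_beta}


for p in (Fraction(1, 2), Fraction(1, 3), Fraction(9, 10)):
    print(p, backcross(p))
```

Whatever value p takes, the proportion of P among the offspring equals the proportion of alpha among the hybrid's gametes, so sampling the offspring of a back-cross gives an estimate of p.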

In  the  next  theorem  we  have  the  same  antecedent  as  in  Theorem  15 
except  that  nothing  is  assumed  about  the  relative  proportions  of  the  two 
kinds  of  gamete  in  the  gametes  produced  by  the  hybrids. 

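THEOREM 17  $H\,1 . H\,2 . H\,4b . H\,5 .\supset. (\exists p)(\exists q) . F_E(\alpha, \beta; \alpha, \beta) \in (p - pq + q)P \cap (1 - p)(1 - q)Q . G(\alpha, \beta, E) \cap ♂ \in p\alpha \cap (1 - p)\beta . G(\alpha, \beta, E) \cap ♀ \in q\alpha \cap (1 - q)\beta$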

In this case, if we assume, as a result of sampling, that $(p - pq + q) = \tfrac{3}{4}$
and $(1 - p)(1 - q) = \tfrac{1}{4}$ we cannot determine the value of p and of q. But
if p has first been ascertained with the help of THEOREM 16 and sampling
then (at least when p = q) the result can be applied to THEOREM 17. For
the derivation of this theorem we require T X, P 4, T XV, D 10, D 11,
T 2, D 5, D 9, T 9, T 8, T 13, T XIV. The next theorem is the theorem
corresponding to THEOREM 17 in systems where there is no dominance.

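THEOREM 18  $H\,1 . H\,4b . H\,5a .\supset. (\exists p)(\exists q) . F_E(\alpha, \beta; \alpha, \beta) \in pqP \cap (p(1 - q) + q(1 - p))R \cap (1 - p)(1 - q)Q . G(\alpha, \beta, E) \cap ♂ \in p\alpha \cap (1 - p)\beta . G(\alpha, \beta, E) \cap ♀ \in q\alpha \cap (1 - q)\beta$

In this case, if on the basis of sampling we assign a value to pq and to
$(p(1 - q) + q(1 - p))$, then we can determine the values of p and q. The
theorem requires: T X, P 4, T XV, D 10, D 11, T 2, D 5, T 9, T 8,
T XIII, T 13.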
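
The contrast between the cases with and without dominance can be made concrete with exact arithmetic. The sketch below is an illustration under the assumption of random union; the function names are hypothetical. It shows two different pairs (p, q) giving the same observable split when there is dominance, and then recovers p and q, up to which of the two belongs to which sex, from pq and p(1 - q) + q(1 - p).

```python
from fractions import Fraction
from math import isqrt


def hybrid_cross(p, q):
    """Zygote-class proportions when male gametes are p alpha, (1 - p) beta and
    female gametes are q alpha, (1 - q) beta."""
    p, q = Fraction(p), Fraction(q)
    return {"aa": p * q, "ab": p * (1 - q) + q * (1 - p), "bb": (1 - p) * (1 - q)}


# With dominance only aa + ab (that is, p - pq + q) and bb are observable,
# and distinct (p, q) can give the same 3/4 : 1/4 split.
for p, q in [(Fraction(1, 2), Fraction(1, 2)), (Fraction(1, 4), Fraction(2, 3))]:
    r = hybrid_cross(p, q)
    print(p, q, r["aa"] + r["ab"], r["bb"])


def recover(aa, ab):
    """Without dominance: recover the unordered pair {p, q} from pq and p(1-q)+q(1-p)."""
    aa, ab = Fraction(aa), Fraction(ab)
    s = ab + 2 * aa                  # p + q
    disc = s * s - 4 * aa            # (p - q) squared
    root = Fraction(isqrt(disc.numerator), isqrt(disc.denominator))
    if root * root != disc:
        raise ValueError("p and q are not rational for these inputs")
    return ((s - root) / 2, (s + root) / 2)


r = hybrid_cross(Fraction(1, 4), Fraction(2, 3))
print(recover(r["aa"], r["ab"]))
```

The recovery works because pq and p + q are the coefficients of the quadratic whose roots are p and q.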

Finally  a  theorem  will  be  given  which  might  have  been  known  to 
Mendel.  It  is  an  example  of  a  system  which  includes  only  one  genetical 
unit.  Suppose  F  and  M  are  the  females  and  males  respectively  of  some 
species,  suppose  further  that  g  and  h  are  two  mutually  exclusive  classes 
of gametes and H a class of environments all satisfying the following
conditions: (i) $\{F, g, H\}$ is a genetical unit; (ii) $D(g, h, H) \neq \Lambda . D(g, h, H) \subset M . G(g, h, H) \neq \Lambda . G(g, h, H) \subset g \cup h$; (iii) $D(h, h, H) = \Lambda$
(therefore $\{M, h, H\}$ is not a genetical unit); (iv) $S = \{F, M, g, h, H\}$ and
$S \in \mathrm{rand}\,U \cap \mathrm{rand}\,D(H)$. If these conditions are satisfied we shall
have:

$F_H(g, g; g, h) \in \tfrac{1}{2}F \cap \tfrac{1}{2}M$ if and only if $G(g, h, H) \in \tfrac{1}{2}g \cap \tfrac{1}{2}h$

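THEOREM 19  $F \cap M = g \cap h = \Lambda . \{F, g, H\} \in \mathrm{genunit} . D(g, h, H) \neq \Lambda . D(g, h, H) \subset M . G(g, h, H) \neq \Lambda . G(g, h, H) \subset g \cup h . S = \{F, M, g, h, H\} . S \in \mathrm{rand}\,U \cap \mathrm{rand}\,D(H) .\supset: F_H(g, g; g, h) \in \tfrac{1}{2}F \cap \tfrac{1}{2}M .\equiv. G(g, h, H) \in \tfrac{1}{2}g \cap \tfrac{1}{2}h$

This theorem requires for its derivation T 11, T XV, T VII, D 10, D 11,
T XII, T 13.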

We  can  now  see  clearly  what  was  Mendel's  discovery  in  the  Christopher 
Columbus  sense  and  what  was  his  discovery  in  the  J.  J.  Thomson  sense 
distinguished above. His discovery in the first sense (inductive hypothesis)
was the $\tfrac{3}{4}P \cap \tfrac{1}{4}Q$ frequencies in the offspring when hybrids are mated,
if this is understood as being asserted (as above) for inaccessible sets.
This is expressed in THEOREM 15. Mendel's discovery in the second sense
(explanatory hypothesis) is the hypothesis that is expressed in H 4a. But
we have seen that in this form it is unnecessary. The much weaker form
of H 4c suffices, especially if we begin with T 16 and then, using its
results with the value of p determined by sampling (coupled with the
additional hypothesis: p = q), we pass to T 17. Where there is no
dominance (in Mendel's experiments one phenotype is in each case dominant
to the other) p and q can be determined independently of T 16 with the
help of T 18. Thus the convenient minimum assumption is H 4b. It could
be argued that the assumption of two kinds among the gametes of
hybrids is not so much a discovery of the second kind as a special
application of a general causal principle to embryology and genetics. But this
does  not  mean  that  it  cannot  be  discussed. 

It  is  often  said  that  Mendel  discovered  what  is  called  particulate 
inheritance.  But,  except  in  the  sense  in  which  gametes  are  particles, 
Mendel  did  not  specifically  speak  of  particles.  Strictly  speaking  a  hypothe- 
sis involving  cell  parts  only  becomes  important  when  we  consider  the 
breakdown  of  Mendel's  Second  Law.  The  whole  of  Mendel's  work  can  be 
expressed with the help of $D(\alpha, \beta, E)$, $G(\alpha, \beta, E)$ and $F_E(\alpha, \beta; \gamma, \delta)$ and
thus  in  terms  of  gamete  and  environment  classes,  the  classes  of  zygotes 
which  can  be  formed  with  them  and  the  classes  of  lives  which  develop 
from  the  zygotes  in  the  environments. 

The  above  analysis  has  shown  the  central  role  which  is  played  by  the 
hypotheses  of  random  union  of  the  gametes  and  of  random  development 
in  obtaining  the  Mendelian  ratios  (see  especially  steps  (2),  (3)  and  (4)  in 
the  proof  of  Theorem  15).  These  do  not  receive  the  attention  they  deserve 
in  genetical  books.  Sometimes  they  are  not  even  mentioned.  This  is 
particularly  true  of  the  hypothesis  of  random  development.  That  Mendel 
was  aware  of  it  is  clear  from  the  following  passage  in  the  translation  from 
which  we  have  already  quoted  (p.  340) : 

A  perfect  agreement  in  the  numerical  relations  was,  however,  not  to  be  ex- 
pected, since  in  each  fertilization,  even  in  normal  cases,  some  egg  cells  remain 
undeveloped  and  subsequently  die,  and  many  even  of  the  well-formed  seeds 
fail  to  germinate  when  sown. 

In  addition  to  the  special  hypotheses  H  1  to  H  5  there  are  also  the  six 
postulates  to  be  taken  into  consideration.  Any  departure  from  these 
could  affect  the  result.  This  provides  plenty  of  scope  for  reflexion.  But 
perhaps  the  most  striking  feature  of  the  Mendelian  systems  is  the  fact 
that  only  one  class  of  environments  is  involved  and  is  usually  not  even 
mentioned.  Some  interesting  discoveries  may  await  the  investigation  of 
multi-environmental systems. Provision for this is made in Definitions
4, 7 and 11 and a variable having classes of environments as its values
accompanies all the above biological functors. At the same time attention
should  be  drawn  to  the  fact  that  no  provision  is  made,  either  here  or  in 
current  practice,  for  mentioning  the  environments  of  the  gametes.  And 
yet  it  is  not  difficult  to  imagine  situations  in  which  the  necessity  for  this 
might  arise. 

It will be noticed that no use has here been made of the words 'probability',
'chance', or 'independent', although these words are frequently
used in genetical books with very inadequate explanation. Here the term
'random' has been used but its two uses have been explained in detail.
In passing it may be mentioned that 'S is random with respect to $F_E$' is
also definable along analogous lines and then the Pearson-Hardy law
is derivable.

In  conclusion  I  should  like  to  draw  attention  to  the  way  in  which  the 
foregoing  analysis  throws  into  relief  the  genius  of  Mendel,  which  enabled 
him  to  see  his  way  so  clearly  through  such  a  complicated  situation.  I  also 
wish  to  express  my  thanks  to  Professor  John  Gregg  of  Duke  University 
and  to  my  son  Mr  Michael  Woodger  of  the  National  Physical  Laboratory 
for  their  help  in  the  preparation  of  this  article. 


Symposium  on  the  Axiomatic  Method 


AXIOMATIZING  A  SCIENTIFIC  SYSTEM  BY  AXIOMS  IN 
THE  FORM  OF  IDENTIFICATIONS 

R.   B.   BRAITHWAITE 

University  of  Cambridge,  Cambridge,  England 

A scientific deductive system ("scientific theory") is a set of propositions
in which each proposition is either one of a set of initial propositions
(a "highest-level hypothesis") or a deduced proposition (a "lower-level
hypothesis") which is deduced from the set of initial propositions
according to logico-mathematical principles of deduction, and in which some
(or all) of the propositions of the system are propositions exclusively
about observable concepts (properties or relations) and are directly
testable against experience. In this paper these testable propositions will
be taken to be empirical generalizations of the form Every A-specimen is
a B-specimen, whose empirical testability consists in the fact that such a
proposition is to be rejected if an A-specimen which is not a B-specimen
is observed. (Statistical generalizations of the form The probability of an
A-specimen being a B-specimen is p can be brought within the treatment;
here testability depends upon more sophisticated rejection rules in terms
of the proportions of B-specimens in observed samples of A-specimens.)
The object of constructing a scientific theory is to 'explain' empirical
generalizations by deducing them from higher-level hypotheses.
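
The rejection rule for a generalization of the form Every A-specimen is a B-specimen is simple enough to state as a one-line check. The sketch below is only an illustration; the encoding of an observation as a pair of booleans (is it an A-specimen, is it a B-specimen) and the function name `rejected` are assumptions made for the example.

```python
def rejected(observations):
    """Return True if 'Every A-specimen is a B-specimen' must be rejected,
    i.e. if some observed specimen is an A-specimen but not a B-specimen.

    Each observation is a hypothetical pair (is_a, is_b) of booleans.
    """
    return any(is_a and not is_b for is_a, is_b in observations)


# No counter-instance among these observations: the generalization stands.
print(rejected([(True, True), (False, True), (False, False)]))

# A single A-specimen that is not a B-specimen suffices for rejection.
print(rejected([(True, True), (True, False)]))
```

Observations of specimens that are not A-specimens never lead to rejection, whatever else is true of them.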

A  scientific  deductive  system  will  make  use  of  a  basic  logic  independent 
of  the  system  to  provide  its  principles  of  deduction.  It  will  be  convenient 
to  assume  that  this  basic  logic  includes  all  the  deductive  principles  of  the 
system,  so  that  none  of  these  are  specific  to  the  system  itself  and  the 
deductive  power  of  the  system  will  be  given  by  the  addition  to  the  basic 
logic  of  the  system's  set  of  initial  propositions.  The  system  can  then  be 
expressed  by  a  formal  axiomatic  system  (called  here  a  calculus)  in  which 
the  axioms  (the  "initial  formulae")  fall  into  two  sets,  one  set  consisting 
of  those  axioms  required  for  the  basic  logic  of  the  system  (which  set  will 
be  empty  if  the  basic  logic  has  no  axioms)  —  no  axiom  of  this  set  will 
contain  any  extra-logical  constants  —  and  another  set  of  axioms  con- 
taining non- vacuously  extra-logical  constants  (Tarski's  proper  axioms 
[10,  p.  306])  corresponding,  one  to  one,  to  the  set  of  initial  propositions 
of  the  scientific  system.  The  rules  of  derivation  of  the  calculus  will  then 
correspond  to  the  deductive  principles  of  the  basic  logic.  Since  we  are  not 
concerned  with  the  nature  of  this  basic  logic  we  shall  ignore  the  axioms 
and  theorems  of  the  calculus  which  forms  a  sub-calculus  representing  the 
basic  logic  and  shall  only  be  interested  in  proper  axioms  and  proper 
theorems  (i.e.  those  which  contain  non-vacuously  extra-logical  constants, 
which  will  be  called  primitive  terms)  and  which  are  interpreted  as  re- 
presenting the  propositions  of  the  scientific  system.  The  theorems  (or 
axioms)  representing  the  directly  testable  propositions  will  be  called 
testable  theorems  (or  axioms),  and  the  primitive  terms  occurring  in  these 
theorems  observable  terms. 

The  problem  raised  by  scientific  deductive  systems  for  the  philosophy 
(or  logic  or  semantics)  of  science  is  to  understand  how  the  calculus  is  inter- 
preted as  expressing  the  system.  If  all  the  proper  axioms  are  testable 
axioms,  and  consequently  all  the  proper  theorems  are  testable  theorems, 
there  is  no  difficulty,  since  all  the  extra-logical  terms  (i.e.  primitive  terms) 
occurring  in  the  calculus  are  observable  terms  so  that  all  the  proper 
axioms  and  theorems  can  be  interpreted  as  propositions  directly  testable 
by  experience.  The  semantic  rules  for  the  interpretation  of  the  calculus  by 
means  of  direct  testability  apply  equally  to  all  the  proper  axioms  and 
theorems; so the calculus can be interpreted all in a piece.

But  the  situation  is  different  for  the  deductive  system  of  a  more  ad- 
vanced science  which  makes  use  in  its  initial  propositions  of  concepts  (call 
them  theoretical  concepts)  which  are  not  directly  observable,  so  that  the 
propositions  containing  these  are  not  directly  testable.  Here  the  axioms 
of  the  calculus  contain  primitive  terms  which  are  not  observable  terms, 
and  these  theoretical  terms  have  to  be  given  an  interpretation  not  by  a 
semantic  rule  concerning  direct  testability  but  by  the  fact  that  testable 
theorems  are  derivable  from  them  in  the  calculus.  The  calculus  is  thus 
interpreted  from  the  bottom  upwards :  the  testable  theorems  are  interpreted 
by  a  semantic  rule  of  direct  testability,  and  the  other  theorems  and  axioms 
are then interpreted syntactico-semantically by their syntactic relations
to  the  testable  theorems.  Theoretical  terms  are  not  definable  by  means 
of  observable  concepts  —  the  'reductionist'  programme  of  thorough- 
going logical  constructionists  and  operationalists  cannot  profitably  be 
applied  to  the  theoretical  concepts  of  a  science  —  though  they  may  be 
said  to  be  implicitly  defined  by  virtue  of  their  place  in  a  calculus  which 
contains  testable  theorems.  The  empirical  interpretation  of  the  calculus 
is  thus  given  by  a  directly  empirical  interpretation  of  the  testable  axioms 
and  theorems  and  an  indirectly  empirical  interpretation  of  the  remainder. 
(For  all  this  see  R.  B.  Braithwaite  [3,  Chapter  III].) 

In  order  that  a  calculus  containing  theoretical  terms  should  be  able  to 
be  interpreted  in  this  indirectly  empirical  way,  it  is  necessary  that  each 
of  the  observable  terms  should  occur  in  at  least  one  of  the  proper  axioms. 
These  may  be  divided  into  three  categories:  (1)  Testable  axioms  whose 
primitive  terms  are  all  observable  terms;  (2)  Axioms  whose  primitive 
terms are all theoretical terms: these will be called Campbellian axioms,
since collectively they represent N. R. Campbell's "hypothesis" consisting
of "statements about some collection of ideas which are characteristic
of the [scientific] theory" ([4], p. 122), and the highest-level hypotheses
represented by them will be called Campbellian hypotheses;
(3) Axioms whose primitive terms are both observable terms and theoretical
terms: these will be called dictionary axioms, since they correspond
to Campbell's "dictionary". Since no philosophical problems arise in
connexion with testable axioms, we will suppose that there are no testable
axioms in the calculus, so that no direct empirical interpretation is
possible at the axiom level. To simplify our discussion we will further
suppose that each dictionary axiom is of the form of an identity

$$a = (\ldots \lambda_1 \ldots \lambda_2 \ldots)$$

where $a$ is an observable term standing alone on the left-hand side of the
identity with the right-hand side containing only theoretical terms $\lambda_1$, $\lambda_2$,
etc. as primitive terms. Dictionary axioms in this form will be called
identificatory axioms, since they may be said to 'identify' an observable
term by means of theoretical terms. In order that these identificatory
axioms should be able to function in a calculus to be interpreted as a
scientific system, the basic logic governing the identity sign will be
assumed to be strong enough to permit the derivation from an axiom of
the form $a = (\ldots \lambda_1 \ldots \lambda_2 \ldots)$ of every theorem obtained by
substituting $a$ for $(\ldots \lambda_1 \ldots \lambda_2 \ldots)$ at any place in any axiom or theorem in
which $(\ldots \lambda_1 \ldots \lambda_2 \ldots)$ occurs.

The  simplified  calculi  to  be  considered  will  thus  contain,  as  proper 
axioms,  Campbellian  axioms  concerned  with  the  theoretical  terms  of  the 
scientific  calculus  and  identificatory  axioms  relating  the  observable  terms 
of  the  calculus  to  the  theoretical  terms  by  identifying  each  of  the  former 
with a logical function of the latter. If $a$ is an $n$-ary predicate, an
identificatory axiom $a = (\ldots \lambda_1 \ldots \lambda_2 \ldots)$ will, with a suitable basic logic,
permit the derivation from this axiom of

$$(x_1) \ldots (x_n)\,(a(x_1, x_2, \ldots, x_n) \equiv Q(x_1, x_2, \ldots, x_n)),$$

where $Q$ is an abbreviation for $(\ldots \lambda_1 \ldots \lambda_2 \ldots)$, so that $a$ will be
definable with respect to the identificatory axiom (together with the basic
logic) in terms of $\lambda_1$, $\lambda_2$, etc. in E. W. Beth's sense of "definable" ([2], p.
335). (In [3, p. 57] I called sentences of the form $a = (\ldots \lambda_1 \ldots \lambda_2 \ldots)$
definitory formulae; but I now prefer to call them identificatory axioms
(or theorems) and to reserve the word "definition" and its cognates for
notions which are semantical and not purely syntactical.)

Most  axiomatizations  of  a  scientific  theory  contain  Campbellian  axioms 
among  their  proper  axioms.  Philosophers  of  science  frequently  think  that 
it  is  the  Campbellian  axioms  representing  the  Campbellian  hypotheses 
which  express  the  essence  of  the  theory,  the  dictionary  axioms  (which  in 
the  simplest  cases  are  identificatory  axioms)  having  the  function  of 
'semantical  rules'  or  'co-ordinating  definitions'  or  'definitory  stipulations' 
relating  the  observable  terms  to  the  theoretical  terms.  Thus  there  would 
be  an  absolute  distinction  between  Campbellian  and  dictionary  axioms. 
It  would  follow  from  this  point  of  view  that  a  calculus  which  makes  use 
of theoretical terms must include Campbellian axioms if it is to be
interpreted to express what is of importance in the scientific theory. This,
however,  is  not  the  case.  Calculi  whose  proper  axioms  are  all  identificatory 
can  serve  to  express  empirical  deductive  systems  :  indeed,  given  a  calculus 
which  contains  Campbellian  axioms,  it  is  sometimes  possible  to  construct 
another  calculus  having  the  same  theoretical  terms  whose  proper  axioms 
are  all  identificatory  which  is  testably  equivalent  to  the  first  calculus  in  the 
sense  that  the  testable  theorems  of  the  two  calculi  are  exactly  the  same. 

This  will  always  be  the  case  if  the  basic  logic  of  the  calculus  is  simple 
enough. We will consider the case in which the basic logic is merely that of
propositional logic combined with that of the first-order monadic predicate
calculus with identity (and with a finite number of predicates). This
basic logic is also that of finite Boolean lattices, and it will be convenient
to regard it as expressed by a calculus (called a Boolean calculus) whose
logical constants are, besides those of the propositional calculus, constants
whose class interpretation is union ($\cup$), intersection ($\cap$), complementation
($'$), the universe class ($e$), the null class ($o$), class inclusion ($\subset$) and class
identity ($=$). This basic logic is sufficient for the construction of theories
in which empirical generalizations of the form Every $AB\ldots$-specimen is a
$K$-specimen (represented in the calculus by a testable theorem
$(a \cap b \cap \ldots) \subset k$, $a$ being interpreted as designating the class of $A$-specimens,
and similarly for the other small italic letters) are explained
as deducible from initial propositions containing theoretical class-concepts
designated by $\lambda_1$, $\lambda_2$, $\ldots$ (Simple examples of such theories are given in
[3, Chapters III and IV].) Since all the propositions concerned will be
universal propositions (i.e., of the form Every $\ldots$-specimen is a $\ldots$-specimen),
every formula of the calculus is equivalent to a formula in
normal form $\ldots = o$.
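
The reduction to normal form can be checked mechanically on a small model. The following Python sketch is an illustration only and is not part of the original argument. It assumes a hypothetical three-element universe class, represents classes as frozensets, and uses invented names (UNIVERSE, NULL, check_all). It verifies exhaustively that an inclusion x ⊂ y holds exactly when (x ∩ y') = o, and that an identity x = y holds exactly when (x ∩ y') ∪ (x' ∩ y) = o.

```python
from itertools import combinations

# Hypothetical finite model: the universe class e and the null class o.
UNIVERSE = frozenset(range(3))
NULL = frozenset()


def complement(x):
    """Complementation x' relative to the universe class."""
    return UNIVERSE - x


def all_classes():
    """Every sub-class of the universe (8 classes for 3 elements)."""
    items = sorted(UNIVERSE)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]


def inclusion_matches_normal_form(x, y):
    """x C y holds exactly when (x n y') = o."""
    return (x <= y) == ((x & complement(y)) == NULL)


def identity_matches_normal_form(x, y):
    """x = y holds exactly when (x n y') u (x' n y) = o."""
    normal = (x & complement(y)) | (complement(x) & y)
    return (x == y) == (normal == NULL)


def check_all():
    """Exhaustive check over every pair of classes in the model."""
    classes = all_classes()
    return all(inclusion_matches_normal_form(x, y)
               and identity_matches_normal_form(x, y)
               for x in classes for y in classes)
```

A call to check_all() runs over all 64 pairs of classes in this small model; a larger universe can be tried by changing UNIVERSE.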

Let $S_1$ be a calculus of this type comprising $n$ identificatory axioms
$D_1, D_2, \ldots D_n$ identifying the $n$ observable terms $a_1, a_2, \ldots a_n$ by means
of $l$ theoretical terms $\lambda_1, \lambda_2, \ldots \lambda_l$.

$D_r$ is $a_r = \Lambda_r$, where $\Lambda_r$ is a Boolean expression whose terms are all
theoretical terms. Let the calculus also comprise $m$ Campbellian axioms
$C_1, C_2, \ldots C_m$ containing theoretical terms alone. Derive from $C_r$ the
equivalent formula $\Gamma_r = e$, where $\Gamma_r$ is a Boolean expression whose terms
are all theoretical terms, and let $\Gamma$ be $(\Gamma_1 \cap \Gamma_2 \cap \ldots \Gamma_m)$.

Now consider a related calculus $S_2$ containing the same observable and
theoretical terms but with no Campbellian axioms. Let its $n$ identificatory
axioms be $E_1, E_2, \ldots E_n$, where $E_r$ is $a_r = (\Lambda_r \cap \Gamma)$. We will prove that
(under a weak condition) $S_2$ is testably equivalent to $S_1$ in that the
testable theorems in each calculus, obtained in each case by eliminating
the theoretical terms from the axioms, are the same.

The proof depends upon the classical theory of elimination of variables
from Boolean equations and is a development of a result of A. N. Whitehead
[11, p. 60, (5) and p. 65, (1)]. Consider the 'universe' of the $l$ theoretical
terms $\lambda_1, \lambda_2, \ldots \lambda_l$ (these are common to both $S_1$ and $S_2$). The
$2^l$ minimals $(\lambda_1 \cap \lambda_2 \cap \ldots \lambda_l)$, $(\lambda_1' \cap \lambda_2 \cap \ldots \lambda_l)$, $\ldots$ $(\lambda_1' \cap \lambda_2' \cap \ldots \lambda_l')$
form a partition of the universe (in accordance with the basic logic of
finite Boolean lattices), i.e. using a suffixed $\mu$ to designate a minimal,
$\mu_r \cap \mu_s = o$ for $r \neq s$; $\bigcup_i \mu_i = e$. Then $\Lambda_r$, the Boolean expression whose
terms are all theoretical terms which is identified with $a_r$ by the
identificatory axiom $D_r$ of $S_1$, is the union of the minimals in some sub-set of
the minimals; and $D_r$ is equivalent to $a_r = \bigcup_{i:\,\mu_i \subset \Lambda_r} \mu_i$ and, in normal form, to

$$\Bigl(a_r' \cap \bigcup_{i:\,\mu_i \subset \Lambda_r} \mu_i\Bigr) \cup \Bigl(a_r \cap \bigcup_{i:\,\mu_i \subset \Lambda_r'} \mu_i\Bigr) = o.$$

If $D$ is $D_1 . D_2 \ldots D_n$, $D$ is then equivalent to $\bigcup_i (A_i \cap \mu_i) = o$, where
$A_i$ is $\bigl(\bigcup_{j:\,\mu_i \subset \Lambda_j} a_j' \cup \bigcup_{j:\,\mu_i \subset \Lambda_j'} a_j\bigr)$.

If $C$ is $C_1 . C_2 \ldots C_m$ (the conjunction of the Campbellian axioms), $C$ is
equivalent to $\bigcup_{i:\,\mu_i \subset \Gamma'} \mu_i = o$. $C . D$ is then equivalent to

$$\bigcup_i (A_i \cap \mu_i) \cup \bigcup_{i:\,\mu_i \subset \Gamma'} (e \cap \mu_i) = o.$$

The resultant in normal form $R_1$ of eliminating all the minimals from
$C . D$ is $\bigcap_{i:\,\mu_i \subset \Gamma} A_i = o$. $R_1$ is equivalent to the conjunction of all the testable
theorems; so a testable formula $T$ is a theorem of $S_1$ if and only if
$R_1 \supset T$.

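By a similar argument applied to the axioms of $S_2$, $E_r$ is equivalent to
$a_r = \bigcup_{i:\,\mu_i \subset (\Gamma \cap \Lambda_r)} \mu_i$ and, in normal form, to

$$\Bigl(a_r' \cap \bigcup_{i:\,\mu_i \subset (\Gamma \cap \Lambda_r)} \mu_i\Bigr) \cup \Bigl(a_r \cap \bigcup_{i:\,\mu_i \subset (\Gamma \cap \Lambda_r')} \mu_i\Bigr) \cup \Bigl(a_r \cap \bigcup_{i:\,\mu_i \subset \Gamma'} \mu_i\Bigr) = o.$$

If $E$ is $E_1 . E_2 \ldots E_n$, $E$ is then equivalent to $\bigcup_i (B_i \cap \mu_i) = o$, where,
for an $i$ such that $\mu_i \subset \Gamma$, $B_i$ is $\bigl(\bigcup_{j:\,\mu_i \subset \Lambda_j} a_j' \cup \bigcup_{j:\,\mu_i \subset \Lambda_j'} a_j\bigr)$;
for an $i$ such that $\mu_i \subset \Gamma'$, $B_i$ is $\bigcup_j a_j$.

The resultant in normal form $R_2$ of eliminating all the minimals from $E$ is
$\bigcap_i B_i = o$, which, since $B_i = A_i$ for every $i$ such that $\mu_i \subset \Gamma$, is
equivalent to $\bigcap_{i:\,\mu_i \subset \Gamma} A_i \cap \bigcup_j a_j = o$. A testable formula $T$ is a theorem of $S_2$
if and only if $R_2 \supset T$.

Now impose the weak condition that $\Gamma$ should not be wholly included
within $\bigcup_j \Lambda_j$, i.e. $\Gamma \neq (\Gamma \cap \bigcup_j \Lambda_j)$. Under this condition there is at
least one minimal, say $\mu_s$, which is such that both $\mu_s \subset \Gamma$ and $\mu_s \subset \Lambda_j'$
for every $j$. Then for this $s$, $A_s = B_s = \bigcup_j a_j$, and $R_2$, like $R_1$, is
$\bigcap_{i:\,\mu_i \subset \Gamma} A_i = o$. Hence $R_1 = R_2$; and $T$ is a testable theorem of $S_1$ if and only
if $T$ is a testable theorem of $S_2$.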

Thus, unless the Campbellian axioms $C$ of $S_1$ restrict the universe of
theoretical terms to a class $\Gamma$ which is included in the union of all the
observable terms according to their identifications in $S_1$, the calculus $S_2$,
constructed from $S_1$ by omitting its Campbellian axioms and substituting
$(\Lambda_r \cap \Gamma)$ for $\Lambda_r$ in each of its identificatory axioms, is testably equivalent
to $S_1$ in the sense that every testable theorem of the one is also a testable
theorem of the other.
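
The result can be illustrated semantically by brute force. The Python sketch below is not the syntactic elimination proof given above. It rests on an assumption of this illustration: that in a monadic Boolean calculus the testable theorems are fixed by the set of observable "profiles" (patterns of membership in the observable classes) which the axioms leave open to an individual. The function names and the two sample Campbellian restrictions are hypothetical. Each minimal is a tuple of truth values for the theoretical terms, and each Boolean expression is a predicate on minimals.

```python
from itertools import product


def minimals(l):
    """All 2**l minimals, each a tuple of l truth values."""
    return list(product([False, True], repeat=l))


def profiles_s1(lambdas, gamma, l):
    """Profiles left open by the first calculus: a_r = Lambda_r, Gamma = e."""
    return {tuple(lam(m) for lam in lambdas)
            for m in minimals(l) if gamma(m)}


def profiles_s2(lambdas, gamma, l):
    """Profiles left open by the second: a_r = (Lambda_r n Gamma) only."""
    return {tuple(lam(m) and gamma(m) for lam in lambdas)
            for m in minimals(l)}


def weak_condition(lambdas, gamma, l):
    """True if some minimal lies within Gamma but within no Lambda_r."""
    return any(gamma(m) and not any(lam(m) for lam in lambdas)
               for m in minimals(l))


# Hypothetical example with three theoretical terms (indices 0, 1, 2).
LAMBDAS = [
    lambda m: m[0] and m[1],   # Lambda_1
    lambda m: m[1] and m[2],   # Lambda_2
    lambda m: m[2] and m[0],   # Lambda_3
]


def gamma_ok(m):
    """Sample Campbellian restriction: first term included in second."""
    return (not m[0]) or m[1]


def gamma_too_strong(m):
    """Sample restriction confining the universe to a single minimal."""
    return m[0] and m[1] and m[2]
```

With gamma_ok the weak condition holds and the two sets of profiles coincide; with gamma_too_strong the condition fails and the second calculus leaves open the all-false profile which the first excludes.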


In Whitehead's language [11, p. 59] each identificatory axiom is unlimiting
with respect to all the theoretical terms simultaneously in the
sense that the resultant of eliminating the observable term from the
axiom is equivalent to $o = o$, a theorem of the basic logic. A calculus
such as $S_2$ whose proper axioms are all identificatory therefore imposes
no limitation upon the universe (Whitehead's field) of the theoretical
terms. Such a limitation is imposed by the Campbellian axioms of $S_1$.
But,  if  this  limitation  restricts  the  theoretical-term  universe  to  a  universe 
which  falls  wholly  within  the  class  which  is  the  union  of  all  the  observable 
terms,  it  will  be  impossible  in  the  future  to  adapt  the  scientific  theory 
expressed  by  a  calculus  using  theoretical  terms  limited  in  this  way  to 
explain  new  empirical  generalizations  relating  some  of  the  observable 
concepts  to  new  observable  concepts  not  concerned  in  the  original  theory 
[3,  pp.  73ff.].  An  axiomatization  of  a  scientific  theory  which  is  capable  of 
being  adapted  in  this  way  must  not  impose  such  a  drastic  limitation  upon 
its  theoretical  terms.  So  our  result  may  be  put  in  the  form  that  to  every 
adaptable  calculus  comprising  Campbellian  axioms  a  testably  equivalent 
calculus  can  be  constructed  all  of  whose  proper  axioms  are  identificatory. 

This  result  has  been  established  only  for  a  scientific  system  which 
makes  use  of  a  very  simple  basic  logic  (that  of  finite  Boolean  lattices) ;  and 
the  extent  to  which  it  can  be  generalized  to  apply  to  systems  comprising 
Campbellian  hypotheses  and  using  more  powerful  basic  logics  requires 
investigation.  That  it  is  possible  to  have  a  theory  using  a  mathematical 
basic  logic  in  whose  calculus  theorems  are  derived  from  identificatory 
axioms  alone  is  shown  by  such  a  simple  example  as  that  of  explaining 
$a^2 + b^2 = 1$, where $a$ and $b$ stand for observably determined numbers, by
identifying $a$ with $\sin\theta$ and $b$ with $\cos\theta$, $\theta$ being a theoretical 'parameter'.
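
A minimal numerical sketch of this example follows. It is an illustration only: the function names, the sample parameter values and the tolerance are assumptions of the sketch. The two identifications are the only 'axioms'; the generalization relating the observables then holds for any value of the parameter, up to floating-point rounding.

```python
import math


def observables(theta):
    """Identify a with sin(theta) and b with cos(theta)."""
    return math.sin(theta), math.cos(theta)


def generalization_holds(theta, tol=1e-12):
    """Check a**2 + b**2 = 1 within a small floating-point tolerance."""
    a, b = observables(theta)
    return abs(a * a + b * b - 1.0) <= tol


# Hypothetical sample values of the theoretical parameter.
SAMPLE_THETAS = [0.0, 0.5, 1.0, math.pi / 3, 2.0, -4.2]
```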

One obvious qualification must be made. If identificatory axioms in the
form of a description $a = (\iota x)(\phi(x))$ are permitted, and if their underlying
logic is similar to Russell's doctrine of descriptions in that $(\exists x)(\phi(x))$ is
derivable from any formula containing $(\iota x)(\phi(x))$, an identificatory axiom
for a calculus $S_3$ of the form $a_r = (\iota x)(x = \Lambda_r \,.\, \Gamma = e)$ would imply both
$a_r = \Lambda_r$ and $\Gamma = e$, and all the axioms of $S_1$ would be derivable from
those of $S_3$, a stronger system. But every theoretical scientist would
regard the proposal to substitute a theory expressed by $S_3$ for one
expressed by $S_1$ as a logician's trick. So for scientific discussion the notion
of identificatory axiom must be restricted to one from which alone no
Campbellian theorem can be derived, i.e. an identificatory axiom must be
unlimiting with respect to all its theoretical terms simultaneously.



The  possibility,  in  suitable  cases,  of  constructing  a  testably  equivalent 
calculus  comprising  only  identificatory  proper  axioms  is  very  relevant 
to  the  discussion  among  philosophers  of  science  as  to  whether  or  not  some 
of  the  highest-level  hypotheses  of  the  scientific  theory  expressed  by  the 
calculus  should  be  regarded  as  analytic  or  logically  necessary  rather  than 
as factual or contingent. It is admitted by all empiricists that the
conjunction of all the hypotheses must be contingent, since together they
have empirically testable consequences. But, if the highest-level hypotheses
contain theoretical concepts, it is never from one of these hypotheses
alone but always from a conjunction of them that testable propositions
are deducible; and so the possibility is left open that some of these
hypotheses are not contingent, and hypotheses representing dictionary
axioms (e.g. the identificatory axioms considered in this paper) are
frequently held to be analytic. For example, A. J. Ayer [1, p. 13], in his
account of the "indirect verifiability" of scientific statements (which is
similar to mine), explicitly allows that the conjunctions whose consequences
are "directly verifiable" may include analytic statements, his
reason  being  that  "while  the  statements  that  contain  [theoretical]  terms 
may  not  appear  to  describe  anything  that  anyone  could  ever  observe,  a 
'dictionary'  may  be  provided  by  means  of  which  they  can  be  transformed 
into  statements  that  are  verifiable ;  and  the  statements  which  constitute 
the  dictionary  can  be  regarded  as  analytic".  And  E.  Nagel  [9,  pp.  209f.], 
in  a  recent  discussion  of  my  book  [3],  criticises  me  for  my  "disinclination 
to  regard  as  'absolute'  Norman  Campbell's  distinction  between  the 
'hypotheses'  and  the  'dictionary'  of  a  theory.  In  Campbell's  analysis,  the 
hypotheses postulate just what relations hold between the purely theoretical
but otherwise unspecified terms of a theory, while the dictionary
provides the co-ordinating definitions for some of the theoretical terms or
for certain functions of them". "Every testable theory must include a
sufficient number of co-ordinating definitions which are not subject to
experimental control"; and, though Nagel never explicitly says that
co-ordinating definitions state analytic propositions, he declares that they
have "the status of semantic rules" and contrasts them with "factually
testable assumptions" and with "genuine hypotheses". The existence of
calculi with no Campbellian axioms representing "genuine hypotheses"
and the possibility in suitable cases of converting calculi having Campbellian
axioms into calculi with only identificatory proper axioms make
it  impossible  to  ascribe  a  logically  necessary  status  to  what  is  represented 
by the identificatory axioms taken all together. Since it is the whole set
of the hypotheses that conjunctively are "subject to experimental
control", it is possible that some sub-set of them are not so subject. But
there would seem to be no good reason for placing any of the identificatory
axioms in this latter category. Nagel goes so far as to say that in the
simplest calculus which I gave as an example [3, pp. 54ff.], in which
$a = (\lambda \cap \mu)$, $b = (\mu \cap \nu)$, $c = (\nu \cap \lambda)$ are the axioms and $(a \cap b) \subset c$,
$(b \cap c) \subset a$, $(c \cap a) \subset b$ are the testable theorems, "the obvious (and I
think correct) alternative to Braithwaite's account is to construe two of
the equational formulas in the [axiom set] not as hypotheses but as
having the function of semantical rules ... which assign partial meanings
to the theoretical terms and to count the remaining formula as a genuine
hypothesis when such definitory stipulations have once been laid down."
But he gives no way of selecting the one "genuine hypothesis" from among
the three which appear in the completest symmetry.
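
The symmetry of this simplest calculus can be exhibited by a small brute-force check. The Python sketch below is an illustration only, with invented function names. It reads each class identity pointwise (an individual belongs to a just when it belongs to both theoretical classes identified with a), which is an assumption of the sketch about how the Boolean calculus is interpreted. The second function shows that the first testable theorem fails as soon as the axiom for one observable term is dropped, whichever one it is taken to be.

```python
from itertools import product


def observables(lam, mu, nu):
    """Pointwise reading of the three identificatory axioms."""
    a = lam and mu
    b = mu and nu
    c = nu and lam
    return a, b, c


def implies(p, q):
    """Pointwise reading of class inclusion."""
    return (not p) or q


def testable_theorems_hold():
    """Check the three testable theorems for every membership pattern."""
    for lam, mu, nu in product([False, True], repeat=3):
        a, b, c = observables(lam, mu, nu)
        if not (implies(a and b, c)
                and implies(b and c, a)
                and implies(c and a, b)):
            return False
    return True


def first_theorem_fails_without_axiom_for_c():
    """With c left unconstrained, look for a counterexample to (a n b) C c."""
    for lam, mu, nu, c in product([False, True], repeat=4):
        a = lam and mu
        b = mu and nu
        if a and b and not c:
            return True
    return False
```

Both functions return True: all three theorems follow from the three axioms together, and none of the axioms plays a privileged part in securing them.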

There  would  seem  to  be  a  stronger  case  for  regarding  Campbellian 
hypotheses  as  logically  necessary  and  for  accounting  for  the  contingency 
of  the  lowest-level  generalizations  by  the  contingency  of  the  identifications 
provided by identificatory axioms. The function of Campbellian axioms
is always that of limiting the universe of the theoretical terms, left
unlimited by the identificatory axioms; and it can be said that, since the
theoretical  scientist  in  constructing  a  theory  to  explain  his  empirical 
generalizations  has  great  liberty  of  choice  in  selecting  his  theoretical 
terms,  he  may  well  in  the  act  of  selecting  them  impose  a  limitation  upon 
the  'degrees  of  freedom'  of  their  universe  by  a  set  of  Campbellian  axioms, 
and  this  limitation  (i.e.  the  conjunction  of  the  Campbellian  axioms)  will 
never  by  itself  be  "subject  to  experimental  control",  since  it  is  concerned 
only with theoretical terms. But, in a calculus comprising both
identificatory and Campbellian axioms, the testable theorems derivable from
the former axioms form only a sub-class of the testable theorems derivable
from the conjunction of all the axioms; so the Campbellian axioms may
be given an empirical interpretation by virtue of the testability of the
additional theorems which are derivable by adding them to the
identificatory axioms. There is no adequate reason for refusing to interpret
every proper axiom in a calculus expressing a scientific theory as
representing a contingent proposition, the empirical interpretation of the
axioms being given by the syntactical relations of the whole set of axioms
to the testable theorems derivable from them. The only exception would
be the uninteresting case in which a redundant theoretical term is
introduced into a calculus by an axiom identifying it with a logico-mathematical
function of other theoretical terms. Such a sterile axiom ([3], p. 113),
functioning merely as an abbreviatory device, may rightly be regarded as
analytic.

There  is  one  other  consideration  which  has  tended  to  confuse  the  issue 
in  the  minds  of  some  scientists  and  philosophers  of  science.  If,  as  is 
usually  the  case,  the  basic  logic  of  the  scientific  theory  is  expressible  by  a 
calculus  with  axioms  and  theorems  interpreted  as  propositions  of  logic 
or  mathematics,  these  theorems  will  contain  no  extra-logical  constants; 
and  the  use  of  one  of  these  theorems  in  the  derivation  of  a  proper  theorem 
of  the  scientific  calculus  will  require  an  intermediate  step  in  which  a 
logical  theorem  is  applied  to  the  primitive  terms  concerned.  If  the  logical 
sub-calculus  uses  the  device  of  variables,  this  application  will  be  effected 
by  making  substitutions  of  primitive  terms  for  some  or  all  of  these 
variables.  The  theorem  so  derived  will  not  be  a  proper  theorem  of  the 
calculus,  since  the  primitive  terms  will  occur  in  it  only  vacuously,  but 
neither will it be a theorem of the logical sub-calculus since it will
contain primitive terms as extra-logical constants. Call such a theorem an
applicational theorem. (For example, the derivation of $(a \cap b) \subset c$ from
$a = (\lambda \cap \mu)$, $b = (\mu \cap \nu)$, $c = (\nu \cap \lambda)$ will require (if the basic logic is
expressed as a Boolean calculus) the use of the applicational theorem
$(\mu \cap \mu) = \mu$, which is not itself a theorem of a Boolean calculus but is
derived from the Boolean theorem (or axiom) $(x \cap x) = x$, where $x$ is a
free variable with class symbols as substitution values.)
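
The place of the applicational theorem in this derivation may be set out step by step. The following is one possible arrangement, offered as a sketch; the choice and order of the Boolean laws cited at each step are assumptions of this illustration and are not taken from [3].

```latex
\begin{align*}
a \cap b &= (\lambda \cap \mu) \cap (\mu \cap \nu)
  && \text{identificatory axioms for } a \text{ and } b \\
&= \lambda \cap (\mu \cap \mu) \cap \nu
  && \text{associativity of } \cap \\
&= \lambda \cap \mu \cap \nu
  && \text{applicational theorem } (\mu \cap \mu) = \mu \\
&\subset \nu \cap \lambda
  && \text{since } (x \cap y) \subset x \text{, with commutativity} \\
&= c
  && \text{identificatory axiom for } c
\end{align*}
```

Only the third step uses a formula containing a primitive term vacuously; the first and last steps use proper axioms, and the remaining steps are themselves applications of Boolean laws to the primitive terms.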

Applicational  theorems  fall  in  the  no-man's-land  between  the  theorems 
of  the  basic-logic  sub-calculus  and  the  proper  theorems  of  the  calculus. 
If  the  scientific  part  of  the  whole  calculus  is  regarded  not  (as  we  have 
thought  of  it)  as  consisting  of  proper  axioms  and  theorems  (i.e.  those 
containing primitive terms non-vacuously), but as consisting of all the
axioms and theorems which are not comprised in the basic-logic
sub-calculus (i.e. those which contain primitive terms either vacuously or
non-vacuously), then the applicational theorems will be classed as falling
within  the  scientific  part.  Since  they  will  usually  function  there  as 
premisses  from  which,  together  with  the  proper  axioms,  proper  theorems 
are  derived,  and  will  not  themselves  be  derived  within  this  scientific  part, 
it  will  be  natural  to  class  them,  within  this  scientific  part,  with  the  proper 
axioms  rather  than  with  the  proper  theorems.  A  person  who  takes  this 
point of view will then hold that the scientific part of the calculus
comprises axioms which are to be interpreted as representing logically
necessary propositions, these 'pseudo-axioms' being applicational theorems
whose interpretations are logically necessary by virtue of being
applications to the concepts concerned of the laws of logic or mathematics.
When, as is usually the case, the primitive terms concerned in the
applicational pseudo-axioms are theoretical terms, these pseudo-axioms will
simulate Campbellian axioms; and if the calculus is one all of whose proper
axioms are identificatory, it will be described by a person who mistakes
such applicational pseudo-axioms for Campbellian axioms not as a calculus
with no Campbellian axioms, but as a calculus whose Campbellian
axioms represent Campbellian hypotheses which are logically necessary.
If this person also does not regard identificatory axioms as representing
"genuine hypotheses", he may well assert that all the "genuine hypotheses"
of the theory expressed by the calculus are logically necessary or a
priori.

In  our  own  time  the  thesis  that  the  fundamental  laws  of  physics  are  a 
priori  has  been  maintained  by  A.  S.  Eddington,  who  has  attempted  to 
infer them, including the pure numbers which occur in them, from
"epistemological considerations" ([7], p. 57). The reasons Eddington
gave at different places in his writings for his general thesis are different
and doubtfully consistent, but his principal reason would seem to be an
argument on Kantian lines that "the fundamental laws and constants
of physics ... are a consequence of the conceptual frame of thought into
which  our  observational  knowledge  is  forced  by  our  method  of  formulating 
it,  and  can  be  discovered  a  priori  by  scrutinising  the  frame  of  thought" 
([7],  p.  104).  Such  a  view  is  incompatible  with  an  empiricist  philosophy  of 
science.  But  Eddington's  programme  of  constructing  a  unified  theory  for 
physics  whose  fundamental  hypotheses  are  to  be  a  priori  appears  in  a  new 
light  if  his  goal  is  described  negatively  as  a  theory  having  no  contingent 
Campbellian  hypotheses.  For  his  goal  would  then,  on  our  way  of  thinking, 
be  a  theory  with  no  Campbellian  hypotheses  at  all,  represented  by  a 
calculus  whose  proper  axioms  were  all  identificatory;  and  we  should 
explain  his  attribution  of  apriority  to  such  a  theory  by  his  having  mistaken 
for  Campbellian  axioms  the  applicational  pseudo-axioms  required  to 
apply  the  basic  logic  to  the  concepts  of  the  theory.  And  a  programme 
of constructing a Campbellian-hypothesis-free system of physics,
unhopeful though it may appear to a physicist, is not ridiculous to an
empiricist philosopher.

Perhaps  because  Eddington  was  not  interested  in  axiomatics  this  way 
of  looking  at  his  programme  never,  it  seems,  occurred  to  him.  But 
scattered throughout his writings (e.g. [6], pp. 3, 242; [7], pp. 41, 134;
[8], p. 265) are many references to the essential part to be played by
"identification" and "definition" in relating observation to theory, and
he does not suppose that these identifications are a priori: "we cannot
foresee what will be the correspondence between elements in [the] a priori
physical description and elements in our familiar apprehension of the
universe" [7, p. 134]. That Eddington's ideal was a system with no Campbellian
axioms is suggested by his preferring the theory of numbers to
geometry as an analogue for a system of physics: "If the analogy with
geometry were to hold good, there would be a limit to the elimination of
hypothesis, for a geometry without any axioms at all is unthinkable.
But ... [in] the theory of numbers ... there is nothing that can be called
an axiom. We shall find reason to believe that this is in closer analogy
with the system of fundamental laws of physics" [7, p. 45]. So I think it is
a fair, and charitable, gloss on Eddington to take his programme as the
constructing for the whole of physical theory of an identificatory system,
whose axiomatization would comprise only identificatory proper axioms,
in contrast with the programme of all other theoretical physicists of
constructing Campbellian systems, whose axiomatization would comprise
Campbellian as well as identificatory axioms.

Can anything in general be said as to the relative advantages of
constructing Campbellian or identificatory systems as explanatory scientific
theories? Not much more, I think, than that, since the calculus expressing
a Campbellian system will be stronger (by virtue of comprising Campbellian
axioms and theorems) than a testably equivalent identificatory
calculus (in which no Campbellian theorem can be derived), a Campbellian
system can probably be more easily adapted in the future to explain new
empirical generalizations, as is illustrated in the history of physics by the
great adaptability of systems which included the conservation of energy as
a Campbellian hypothesis. An identificatory system would seem to be the
more appropriate one for providing the most economical theory to
explain a closed set of empirical generalizations. But it may well be the case
that there are subjects, perhaps those of some of the social sciences, in
which identificatory systems are those which arise most naturally in
reflecting upon the subject-matter concerned. The development of the
social sciences has been retarded by a false belief that numerical
mathematics provides the only deductive techniques so that, to construct a
scientific theory, it is necessary that both the observable and the
theoretical concepts of a science should be numerically measurable. It may
also have been retarded by a false belief that a science can only use
theoretical concepts if these can be related together in Campbellian
hypotheses. A realization by social scientists that there is no need to
imitate the methods of theory-construction which have proved so successful
in the physical sciences, and that theories whose theoretical concepts
occur only in hypotheses 'identifying' the observable concepts are perfectly
good explanatory theories (provided, of course, that testable consequences
can be deduced from these hypotheses), might encourage them
to a greater boldness in thinking up theoretical concepts and trying out
theories containing them. This sort of encouragement is the contribution
a philosopher of science can make to the progress of science.

One last and philosophical remark. To identify, by means of an
identificatory axiom, an observable term with a logico-mathematical function
of  theoretical  terms  in  a  calculus  expressing  a  scientific  theory  is  one  way 
of  explicating  (in  the  sense  of  R.  Carnap  [5,  Chapter  I])  the  "inexact 
concept"  for  which  the  observable  term  stands  in  ordinary  language.  To 
propose  a  scientific  theory  containing  theoretical  concepts  which  is  to  be 
testable  against  experience  involving  inexact  concepts  requires  expli- 
cations of  these  concepts;  and,  if  the  theory  is  an  identificatory  system, 
the  hypotheses  of  the  theory  will  consist  entirely  of  such  explications. 
Conversely,  a  set  of  explications  by  means  of  theoretical  concepts  will 
constitute  the  hypotheses  of  an  identificatory  system ;  and,  if  this  system 
permits  the  deduction  of  empirically  testable  consequences,  it  will  be  a 
scientific  theory.  A  philosopher  propounding  such  a  system  of  explications 
must  not  be  dismissed  as  a  rationalist  metaphysician  on  the  sole  ground 
that the hypotheses of his system appear all in the form of new 'definitions'.
His system will only fail to be scientific if nothing empirical
follows  from  all  his  definitions  taken  together. 


Bibliography 

[1]  AYER, A. J., Language, Truth and Logic (2nd edition). London 1946, 160 pp.
[2]  BETH, E. W., On Padoa's method in the theory of definition. Indagationes
     Mathematicae, vol. 15 (1953), pp. 330-339.
[3]  BRAITHWAITE, R. B., Scientific Explanation. Cambridge 1953, X + 376 pp.
[4]  CAMPBELL, N. R., Physics: The Elements. Cambridge 1920, X + 565 pp.
[5]  CARNAP, R., Logical Foundations of Probability. Chicago 1950, XVIII + 607 pp.
[6]  EDDINGTON, (Sir) A. S., Relativity Theory of Protons and Electrons. Cambridge
     1936, VIII + 336 pp.
[7]  EDDINGTON, (Sir) A. S., The Philosophy of Physical Science. Cambridge 1939,
     X + 230 pp.
[8]  EDDINGTON, (Sir) A. S., Fundamental Theory. Cambridge 1946, VIII + 292 pp.
[9]  NAGEL, E., A budget of problems in the philosophy of science. The Philosophical
     Review, vol. 66 (1957), pp. 205-225.
[10] TARSKI, A., Some methodological investigations on the definability of concepts.
     Chapter X in Logic, Semantics, Metamathematics. Oxford 1956, XIV + 471 pp.
[11] WHITEHEAD, A. N., Universal Algebra, vol. 1. Cambridge 1898, XXVI + 586 pp.


Symposium  on  the  Axiomatic  Method 


DEFINABLE  TERMS  AND  PRIMITIVES  IN  AXIOM  SYSTEMS 

HERBERT  A.  SIMON 

Carnegie Institute of Technology, Pittsburgh, Pennsylvania, U.S.A.

An  axiom  system  may  be  constructed  for  a  theory  of  empirical  phe- 
nomena with  any  of  a  number  of  goals  in  mind.  Some  of  these  goals  are 
identical  with  those  that  motivate  the  axiomatization  of  mathematical 
theories,  hence  relate  only  to  the  formal  structure  of  the  theory  —  its 
syntax.  Other  goals  for  axiomatizing  scientific  theories  relate  to  the 
problems  of  verifying  the  theories  empirically,  hence  incorporate 
semantic  considerations. 

An  axiom  system  includes,  on  the  one  hand,  entities  like  primitive 
terms,  defined  terms,  and  definitions,  and  on  the  other  hand,  entities  like 
axioms,  theorems,  and  proofs.  Tarski  [10,  p.  296]  has  emphasized  the 
parallelism  between  the  first  triplet  of  terms  and  the  second.  The  usual 
goals  for  axiomatizing  deductive  systems  are  to  insure  that  neither  more 
nor  less  is  posited  by  way  of  primitive  terms  and  axioms  than  is  necessary 
and  sufficient  for  the  formal  correctness  of  the  definitions  and  proofs,  and 
hence  the  derivability  of  the  defined  terms  and  theorems.  An  axiom  sys- 
tem is  usually  accompanied  by  proofs  of  the  independence,  consistency, 
and  completeness  of  its  axioms;  and  presumably  should  also  be  ac- 
companied —  although  it  less  often  is  —  by  proofs  of  the  independence, 
consistency,  and  completeness  of  its  primitive  terms. 

Frequently a set of sentences (axioms and theorems) and terms admits
alternative equivalent axiom systems: that is, non-identical partitionings
of the sentences into axioms and theorems, respectively; and of the terms
into primitive and defined terms. Hence, a particular set of axioms and
primitive  terms  may  be  thought  of  as  a  (not  necessarily  unique)  basis  for 
a  class  of  equivalent  axiom  systems. 

In  constructing  an  axiom  system  for  an  empirical  theory,  we  may 
wish  to  distinguish  sentences  that  can  be  confronted  more  or  less  directly 
with  evidence  (e.g.,  "the  temperature  of  this  water  is  104°")  from  other 
sentences.  We  may  wish  to  make  a  similar  distinction  between  predicates, 
functors, and other terms that appear in such sentences (e.g., "temperature")
and those that do not. The terms "observation sentences" and
"observables" are often used to refer to such sentences and such terms,
respectively. 1

The  distinction  between  observables  and  non-observables  is  useful  in 
determining  how  fully  the  sentences  of  a  theory  can  be  confirmed  or  dis- 
confirmed  by  empirical  evidence,  and  to  what  extent  the  terms  of  the 
theory are operationally defined. In addition to the formal requirements
discussed previously, we might wish to impose the following
conditions on an axiom system for an empirical theory:

(1) that the entire system be factorable into a subsystem that is equivalent
to some axiom system for a part of logic and mathematics, and a
remainder;

(2)  that  in  the  remainder,  axioms  correspond  to  observation  sentences, 
and  primitive  terms  to  observables. 

Condition  (2)  is,  of  course,  a  semantic  rather  than  a  syntactic  condition, 
and  has  no  counterpart  in  the  axiomatization  of  mathematical  theories. 
The  usefulness  of  the  condition  is  that,  if  it  is  met,  the  empirical  testability 
of  observation  sentences  guarantees  the  testability  of  all  the  sentences  in 
the  system,  and  the  operational  definability  of  observables  guarantees  the 
operationality  of  all  the  terms.  In  the  remainder  of  this  paper  we  shall 
explore  some  problems  that  arise  in  trying  to  satisfy  Condition  (2),  and 
some  modifications  in  the  notion  of  definability  —  as  that  term  is  used  in 
formal  systems  —  that  are  needed  to  solve  these  problems. 

The  question  of  what  characteristics  an  axiom  system  should  possess 
has  been  raised  in  the  past  few  years  [9]  in  connection  with  the  definability 
of  mass  in  Newtonian  mechanics.  In  one  recent  axiomatization  of  New- 
tonian particle  mechanics  [5]  particular  care  is  taken  to  meet  the  syntactic 
conditions  for  a  satisfactory  axiomatization,  and  mass  is  introduced  as  a 
primitive  term.  In  another  axiomatization  [8]  special  attention  is  paid  to 
semantic  questions,  and  definitory  equations  for  mass  are  introduced. 

Definability  and  Generic  Definability.  Tarski  [10]  has  proposed  a 
definition  of  the  term  definability  in  a  deductive  system,  and  has  shown 
how  this  definition  provides  a  theoretical  foundation  for  the  method 
employed  by  Padoa  [6]  to  establish  whether  particular  terms  in  a  system 
are definable or primitive. In their axiomatization of classical particle
mechanics, McKinsey, Sugar and Suppes [5, Paragraph 5] employ the
method of Padoa to show that, by Tarski's definition, mass and force are
primitive terms in their system. Application of the same method to Simon's
earlier axiomatization of Newtonian mechanics [8] gives the same
result: mass and force are primitives in that system.

1 For a more extended discussion of these terms, see [2, pp. 454-456].

The  latter  result  appears  to  conflict  with  common-sense  notions  of 
definability,  since  in  [8]  the  masses  of  the  particles  can  (in  general)  be 
computed  when  their  positions  and  accelerations  are  known  at  several 
points  in  time  [8,  Theorem  I].  Condition  (2)  of  the  previous  section  is 
violated  if  masses,  which  are  not  observables,  are  taken  as  primitive 
terms;  and  it  appears  paradoxical  that  it  should  be  possible  to  calculate 
the  masses  when  they  are  neither  observables  nor  defined  terms.  These 
difficulties  suggest  that  Tarski's  concept  of  definability  is  not  the  most 
satisfactory  one  to  use  in  the  axiomatization  of  empirical  science. 

A  closer  examination  of  the  situation,  for  [8],  shows  that  the  masses  are 
not  uniquely  determined  in  certain  situations  that  are  best  regarded  as 
special  cases  —  e.g.,  the  case  of  a  single  unaccelerated  particle.  It  is  by 
the  construction  of  such  special  cases,  and  the  application  of  the  method 
of  Padoa  to  them,  that  McKinsey,  Sugar  and  Suppes  show  mass  to  be  a 
primitive  in  [5],  and  by  inference  in  [8].  But  I  shall  show  that  if  the  defi- 
nition of  Tarski  is  weakened  in  an  appropriate  way  to  eliminate  these 
special  cases  it  no  longer  provides  a  justification  for  the  method  of  Padoa, 
but  does  provide  a  better  explication  of  the  common-sense  notion  of 
definability. 

Statement of the Problem. We shall discuss the problem here in an
informal manner. The treatment can easily be formalized along the lines
of Tarski's paper. 2 In Tarski's terms [10, p. 299], the formula $\phi(x; b', b'', \ldots)$
defines the extra-logical constant a if, for every x, x satisfies $\phi$ if and only
if x is identical with a; i.e., if:

(I)  $(x) : x = a \;.\equiv.\; \phi(x; b', b'', \ldots),$

where x is the only real variable in $\phi$, and $b', b'', \ldots$ are the members of a set
B of extra-logical constants (primitives and/or defined terms).

Translated into these terms, the (attempted) definition of "the mass of
particle i" in [8, p. 892] proceeds thus: (1) We take as the function $\phi$ the
conjunction of the six scalar equations that state the laws of conservation
of momentum and conservation of angular momentum for a system of
particles. (2) We take as the set B the paths of the particles in some time
interval. (3) We take as x the set of numbers $m_i$ that satisfy $\phi$ for the
given B.

2 Compare also [2, p. 439].

This  procedure  does  not  satisfy  Tarski's  definition  since  the  existence 
and  uniqueness  of  the  masses  is  not  guaranteed.  For  example,  in  the  case 
of  a  single,  unaccelerated  particle,  any  number,  m,  substituted  in  the 
equations  for  conservation  of  momentum  and  angular  momentum  will 
satisfy those equations. But Tarski shows (his Theorem 2) that if two
constants satisfy a definitory formula for a particular set, B, they must
be identical.

Generic Definition. To remove the difficulty, we replace Tarski's
definition with a weaker one: the formula $\phi(x; b', b'', \ldots)$ DEFINES
GENERICALLY the extra-logical constant a if, for every x, if x is identical with
a, x satisfies $\phi$:

(I')  $(x) : x = a \;.\supset.\; \phi(x; b', b'', \ldots).$

After  the  equivalence  symbol  in  formula  (I)  has  been  replaced  by  an 
implication  in  this  way,  the  three  theorems  of  Tarski's  paper  are  no 
longer provable. In particular, formula (7) in his proof of Theorem 1 [10,
pp.  301-302]  can  no  longer  be  derived  from  the  modified  forms  of  his 
formulas  (3)  and  (6).  Hence,  the  method  of  Padoa  cannot  be  used  to 
disqualify  a  proposed  generic  definition. 

It  is  easy  to  show  that  in  [8]  mass  is  generically  defined  by  means  of  the 
paths  of  the  particles  on  the  basis  of  the  Third  Law  of  Motion  (more 
exactly, the laws of conservation of momentum and angular momentum);
and  that  resultant  force  is  generically  defined  by  means  of  the  paths  of 
the  particles  and  their  masses  on  the  basis  of  the  Third  and  Second  Laws 
of  Motion  [8,  p.  901].  Similarly,  we  can  show  that  in  [5,  p.  258]  resultant 
force  is  generically  defined  by  means  of  the  paths  of  the  particles  and  their 
masses  on  the  basis  of  the  Second  Law  of  Motion. 

The  advantage  of  substituting  generic  definition  for  definition  is  that, 
often,  a  constant  is  not  uniquely  determined  for  all  possible  values  of  the 
other  extra-logical  constants,  but  experimental  or  observational  circum- 
stances can  be  devised  that  do  guarantee  for  those  circumstances  the 
unique  determination  of  the  constant. 

In  the  axiom  system  of  [8],  for  example,  the  conditions  under  which 
masses  exist  for  a  system  of  particles  and  the  conditions  under  which 
these  masses  are  unique  have  reasonable  physical  interpretations.  The 
observables are the space-time coordinates of the particles. From a
physical standpoint, we would expect masses (not necessarily unique) to
be  calculable  from  the  motion  of  a  set  of  particles,  using  the  principles 
of  conservation  of  momentum  and  angular  momentum,  whenever  this 
set  of  particles  was  physically  isolated  from  other  particles.  Moreover, 
we  would  expect  the  relative  masses  to  be  uniquely  determined  whenever 
there  was  no  proper  subset  of  particles  that  was  physically  isolated  from 
the  rest.  These  are  precisely  the  conditions  for  existence  (Definition  3) 
and  uniqueness  (Theorem  1  and  Definition  6)  of  the  masses  in  this  axio- 
matization.  Thus,  the  definition  of  mass  in  [8]  does  not  lead  to  a  unique 
determination  of  the  mass  of  a  single  star  at  a  great  distance  from  other 
stars,  but  does  permit  the  calculation,  uniquely  up  to  a  factor  of  pro- 
portionality, of  the  masses  of  the  members  of  the  solar  system  from  obser- 
vation of  their  paths  alone,  and  without  postulating  a  particular  force 
law  [8,  pp.  900-901]. 
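
The existence and uniqueness conditions just described can be illustrated by a small Python sketch that counts how many free parameters the six conservation equations leave in the masses. It is not taken from [8]: the function names, the sample data, and the use of exact rational arithmetic are assumptions made for the example, and the samples are synthetic instants constructed to satisfy the conservation laws, not integrated trajectories.

```python
from fractions import Fraction as F

def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def constraint_rows(samples):
    """Rows of the homogeneous system M m = 0 for the unknown masses m.
    samples: list of instants; each instant lists one (position, acceleration)
    pair per particle."""
    rows = []
    for instant in samples:
        moments = [cross(r, a) for r, a in instant]
        for k in range(3):
            rows.append([F(a[k]) for _, a in instant])  # sum_i m_i a_i = 0
            rows.append([F(t[k]) for t in moments])     # sum_i m_i (r_i x a_i) = 0
    return rows

def rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [row[:] for row in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def free_mass_parameters(samples):
    """Dimension of the solution space for the masses.
    A value of 1 means the masses are unique up to a factor of proportionality."""
    return len(samples[0]) - rank(constraint_rows(samples))

def pair_instant(r1, r2, f, m1, m2):
    """Hypothetical isolated pair: equal and opposite forces along the line of centers."""
    force = [F(f) * (q - p) for p, q in zip(r1, r2)]   # force on particle 1
    a1 = tuple(c / m1 for c in force)
    a2 = tuple(-c / m2 for c in force)
    return [(r1, a1), (r2, a2)]

# One interacting pair with masses 1 and 2, observed at two instants.
pair_a = [pair_instant((0, 0, 0), (1, 0, 0), 1, 1, 2),
          pair_instant((0, 1, 0), (2, 1, 1), 3, 1, 2)]
# A second pair (masses 3 and 5) that does not interact with the first.
pair_b = [pair_instant((0, 0, 1), (0, 2, 1), 1, 3, 5),
          pair_instant((1, 1, 0), (1, 4, 2), 2, 3, 5)]
two_pairs = [a + b for a, b in zip(pair_a, pair_b)]

print(free_mass_parameters(pair_a))     # one free parameter: the overall scale
print(free_mass_parameters(two_pairs))  # two: each isolated subset has its own scale
```

In this sketch the single interacting pair leaves only the overall scale of the masses undetermined, whereas the system containing a physically isolated proper subset leaves one scale per subset, so the mass-ratios across the two pairs are not fixed by the paths.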

OTHER  CONCEPTS  OF  DEFINABILITY 

The  sharp  distinctions  between  axioms  and  theorems,  and  between 
primitive  and  defined  terms  have  proved  useful  dichotomies  in  axio- 
matizing  deductive  systems.  We  have  seen  that  difficulties  arise  in  pre- 
serving the  latter  distinction  in  empirical  systems,  when  the  axiom  system 
is  required  to  meet  Condition  (2)  —  when  primitive  terms  are  identified 
with  observables.  But  it  has  long  been  recognized  that  comparable 
difficulties  arise  from  the  other  half  of  Condition  (2),  that  is,  from  the 
identification  of  axioms  with  observation  sentences.  In  our  axiomatization 
of  Newtonian  mechanics,  for  example,  the  law  of  conservation  of  momen- 
tum, applied  to  an  isolated  system  of  particles,  is  an  identity  in  time 
containing  only  a  finite  number  of  parameters  (the  masses).  If  time  is 
assumed  to  be  a  continuous  variable,  this  law  comprises  a  nondenumer- 
able  infinity  of  observation  sentences.  Hence,  the  law  is  not  itself  an 
observation  sentence  nor  is  it  derivable  from  a  finite  set  of  observation 
sentences. 

The  two  difficulties  —  that  with  respect  to  axioms  and  that  with  respect 
to  primitives  —  arise  from  analogous  asymmetries.  In  a  system  of  New- 
tonian mechanics,  given  the  initial  conditions  and  masses  of  a  system  of 
particles,  we  can  deduce  univocally  their  paths.  Given  their  paths,  we 
may or may not be able to derive unique values for the masses. Given the
laws and values of the generically defined primitives, we can deduce
observation sentences; given any finite set of observation sentences, we
cannot generally deduce laws. When the matter is put in this way, the
asymmetry  is  not  surprising,  and  it  is  easy  to  see  that  the  thesis  of  naive 
logical  positivism  —  essentially  the  thesis  of  Condition  (2)  —  is  untenable 
unless  it  is  weakened  substantially. 

Contextual  Definitions,  Implicit  Definitions  and  Reduction  Sentences. 

Revisions  of  the  concept  of  definition  similar  in  aim  to  that  discussed  here 
have  been  proposed  by  a  number  of  empiricists.  Quine's  [7,  p.  42]  notion  of 
contextual  definition,  while  nowhere  spelled  out  formally,  is  an  example: 

The  idea  of  defining  a  symbol  in  use  was,  as  remarked,  an  advance  over  the 
impossible  term-by-term  empiricism  of  Locke  and  Hume.  The  statement, 
rather than the term, came with Frege to be recognized as the unit accountable
to  an  empiricist  critique.  But  what  I  am  now  urging  is  that  even  in  taking  the 
statement  as  unit  we  have  drawn  our  grid  too  finely.  The  unit  of  empirical 
significance  is  the  whole  of  science. 

Braithwaite  [1]  carries  the  argument  a  step  further  by  pointing  out 
advantages  of  having  in  an  empirical  theory  certain  terms  that  are  not 
uniquely  determined  by  observations.  His  discussion  of  this  point  [1,  pp. 
76-77] is worth quoting:

We  can,  however,  extend  the  sense  of  definition  if  we  wish  to  do  so.  In  explicit 
definition, which we have so far considered, the possibilities of interpreting a
certain  symbol  occurring  in  a  calculus  are  reduced  to  one  possibility  by  the 
requirement  that  the  symbol  should  be  synonymous  (within  the  calculus)  with 
a  symbol  or  combination  of  symbols  which  have  already  been  given  an  inter- 
pretation. But  the  possibilities  of  interpreting  a  certain  symbol  occurring  in  a 
calculus  may  be  reduced  without  being  reduced  to  only  one  possibility  by  the 
interpretation  already  given  of  other  symbols  occurring  in  the  formulae  in  the 
calculus.  If  we  wish  to  stress  the  resemblance  between  the  reduction  of  the 
possibilities  of  interpreting  a  symbol  to  only  one  possibility  and  the  reduction 
of  these  possibilities  but  not  to  only  one  possibility,  instead  of  wishing  to  stress 
(as we have so far stressed) the difference between these two sorts of reduction,
we  shall  call  the  second  reduction  as  well  as  the  first  by  the  name  of  definition, 
qualifying  the  noun  by  such  words  as  "implicit"  or  "by  postulate."  With  this 
extension  of  the  meaning  of  definition  the  thesis  of  this  chapter  can  be  ex- 
pressed by  saying  that,  while  the  theoretical  terms  of  a  scientific  theory  are 
implicitly defined by their occurrence in initial formulae in a calculus in which
there  are  derived  formulae  interpreted  as  empirical  generalizations,  the  theo- 
retical terms  cannot  be  explicitly  defined  by  means  of  the  interpretations  of  the 
terms  in  these  derived  formulae  without  the  theory  thereby  becoming  in- 
capable of  growth. 



As a final parallel, I will mention Carnap's concept of reduction sentence
in his essay on Testability and Meaning [2, p. 442]. A reduction sentence
for $Q_3$ is a sentence of the form $Q_2 \supset (Q_1 \supset Q_3)$, where $Q_2$ is interpreted as
the set of conditions under which the subsidiary implication holds, and
where $Q_1$ is interpreted as a (partial) definiens for $Q_3$. Thus, let $Q_2$ be the
statement that a set of particles is isolated; $Q_1$ be the statement that a
certain vector, m, substituted for the coefficients in the equations stating
the laws of conservation of momentum and angular momentum for the
particles, satisfies those equations; and $Q_3$ be the statement that the
components of m are masses of the particles. Then $Q_2 \supset (Q_1 \supset Q_3)$ is
essentially identical with the definition of mass in [8]. The subsidiary
connective is an implication rather than an equivalence because there is
no guarantee that another vector, m', may not also constitute a satisfactory
set of masses, so that $Q_2 \supset (Q_1' \supset Q_3')$, where $Q_1'$ is derived from
$Q_1$, and $Q_3'$ from $Q_3$, by substituting m' for m.

Definability  Almost  Everywhere.  In  preference  to  either  definability  or 
generic  definability,  we  might  want  to  have  a  term  midway  in  strength 
between  these  two  —  a  notion  of  definability  that  would  guarantee  that 
we could "usually" determine the defined term univocally, and that the
cases  in  which  we  could  not  would  be  in  some  sense  exceptional.  Under 
certain  conditions  it  is,  in  fact,  possible  to  introduce  such  a  term.  Suppose 
that  B  is  a  point  in  some  space  possessing  a  measure,  and  let  there  be  a 
sentence  of  form  (I)  that  holds  almost  everywhere  in  the  space  of  B.  Then,  we 
say  that  a  is  DEFINED  ALMOST  EVERYWHERE. 

If,  in  [8],  we  take  B  as  the  time  path  of  the  system  which  satisfies  the 
axioms  in  some  interval  k  <  t  <  m,  and  take  the  Lebesgue  measure  in 
the  appropriate  function  space  for  the  B's  as  the  measure  function,  then 
mass  is  defined  almost  everywhere,  as  is  resultant  force. 

DEFINABILITY  AND  IDENTIFIABILITY 

It  has  not  generally  been  noted  that  the  problem  of  definability  of  non- 
observables  in  axiomatizations  of  empirical  theories  is  identical  with  what 
has  been  termed  the  "identification  problem"  in  the  literature  of  mathe- 
matical statistics  [4,  p.  70;  9,  pp.  341-342].  The  identification  problem  is 
the  problem  of  estimating  the  parameters  that  appear  in  a  system  of 
equations  from  observations  of  the  values  of  the  variables  in  the  same 
system  of  equations. 



Some  Types  of  Identifiability  Problems.  Consider,  for  example,  a  system 
of  linear  equations: 

(1)  $\sum_j a_{ij} x_j = b_i, \quad (i = 1, \ldots, n)$


where  the  x's  are  observables  and  the  a's  and  b's  are  parameters.  The 
a's  and  b's  are  generically  defined  by  this  system  of  equations,  but  they 
are  not  defined  in  Tarski's  sense,  for,  no  matter  how  many  sets  of  obser- 
vations of  the  x's  we  have,  the  a's  and  b's  are  not  uniquely  determined. 
For  suppose  that  A  and  b  are  a  matrix  and  vector,  respectively,  that 
satisfy  (1)  for  the  observed  x's.  3  Then  A'  and  b'  will  also  satisfy  (1),  where 
A' = PA and b' = Pb for any non-singular matrix P. To identify the a's
and b's (that is, to make it possible to estimate them uniquely), additional
constraints beyond those embodied in equations (1) must be
introduced. 
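
The argument can be checked numerically with a short Python sketch. The particular matrices, the observation, and the helper names below are illustrative assumptions, not values from the literature cited; the point is only that A' = PA and b' = Pb fit the observed x's exactly as well as A and b do.

```python
def matvec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def matmul(P, A):
    """Product of two matrices given as lists of rows."""
    cols = list(zip(*A))
    return [[sum(p * a for p, a in zip(row, col)) for col in cols] for row in P]

def satisfies(A, b, observations):
    """True if A x = b holds for every observed x."""
    return all(matvec(A, x) == b for x in observations)

A = [[1, 2], [3, 4]]
b = [5, 11]
observations = [[1, 2]]   # satisfies A x = b
P = [[2, 1], [0, 1]]      # an arbitrary non-singular matrix

A_alt = matmul(P, A)      # A' = P A
b_alt = matvec(P, b)      # b' = P b

print(satisfies(A, b, observations))          # True
print(satisfies(A_alt, b_alt, observations))  # True
print(A_alt != A)                             # True: a different set of parameters
```

Since every non-singular P yields another admissible pair, no number of further observations of the x's can single out one set of parameters.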

On  the  other  hand,  consider  the  system  of  linear  difference  equations: 

(2)  $\sum_j a_{ij} x_j(t) = x_i(t + 1), \quad (i = 1, \ldots, n)$


where,  as  before,  the  x's  are  observables,  and  the  a's  and  b's  constant 
parameters.  In  this  case,  the  a's  are  defined  almost  everywhere  in  the 
space of x(t). There are $n^2$ parameters to be estimated, and the number
of equations of form (2) available for estimating them is n(k - 1), where
k is the number of points in time at which the x's are observed. Hence, for
almost all paths of the system, and for k > n + 1, the a's will be determined
uniquely. 4
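
A minimal Python sketch of this counting argument follows. It assumes exact observations, recovers the a's from n + 1 successive states (which by the count above supply exactly n^2 equations), and reports failure on an exceptional path. The matrix, the starting points, and the function names are hypothetical choices made for the illustration.

```python
from fractions import Fraction as F

def simulate(A, x0, k):
    """Return k successive states x(1), ..., x(k) of x(t + 1) = A x(t)."""
    path = [list(x0)]
    for _ in range(k - 1):
        x = path[-1]
        path.append([sum(a * xj for a, xj in zip(row, x)) for row in A])
    return path

def solve(M, rhs):
    """Solve M y = rhs exactly; return None when M is singular."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if aug[i][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [v - factor * w for v, w in zip(aug[i], aug[col])]
    return [row[n] for row in aug]

def estimate(path):
    """Recover the a's from n + 1 observed states, or None on an exceptional path."""
    n = len(path[0])
    states = path[:n]                      # rows are x(1), ..., x(n)
    rows = []
    for i in range(n):
        # Row i of A satisfies sum_j a_ij x_j(t) = x_i(t + 1) for t = 1, ..., n.
        row = solve(states, [path[t + 1][i] for t in range(n)])
        if row is None:
            return None
        rows.append(row)
    return rows

A = [[1, 1], [0, 2]]
print(estimate(simulate(A, [0, 1], 3)) == A)   # True: a generic path determines the a's
print(estimate(simulate(A, [1, 1], 3)))        # None: the path starts on an eigenvector
```

The second path stays on a single line through the origin, so the observed states are linearly dependent and the a's are not determined. Paths of that kind are the exceptional cases that the qualification "almost all paths" sets aside.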

We  see  that  the  system  of  equations  (2)  is  quite  analogous  to  the  system 
of  equations  used  in  [8]  to  define  mass.  In  the  latter  system,  for  n  particles, 
having 3n position coordinates, there are 6 second order differential
equations  (three  for  conservation  of  momentum,  three  for  conservation 
of  angular  momentum)  that  are  homogeneous  in  the  m's,  and  that  must 
hold identically in t. There are (n - 1) parameters to be estimated: the
number of mass-ratios of the particles, referred to a particular one of
them as unit. Hence, for almost all paths of the system, the mass-ratios
can be estimated uniquely from observations of the positions of the
particles at (… + 2) points in time.

3 In this entire discussion, we are disregarding errors of observation and the fact
that the equations may be only approximately satisfied. For an analysis that takes
into account these additional complications, the reader must refer to [3] and [4].

4 The convenience of replacing identifiability (equivalent to Tarski's definability)
by almost-everywhere identifiability (equivalent to almost-everywhere definability)
has already been noted in the literature on the identification problem [4, p. 82; 3,
p. 53].

Correspondingly,  the  system  of  equations  (1)  is  analogous  to  the 
system  of  equations  used  in  [8,  p.  901]  to  define  the  component  forces 
between  pairs  of  particles.  Component  forces  are  only  generically  defined. 
Hence,  although  the  masses  of  particles  in  a  system  and  the  resultant 
forces  acting  upon  them  can,  in  general,  be  estimated  if  there  is  a  sufficient 
number  of  observations  of  the  positions  of  the  particles;  the  component 
forces  cannot  be  so  estimated  unless  additional  identifying  assumptions 
are  introduced.  Such  additional  assumptions  might,  for  example,  take 
the  form  of  a  particular  force  law,  like  the  inverse  square  law  of  gravi- 
tational attraction. 

Over-Identification  and  Testability.  When  a  scientific  theory  is  axio- 
matized  with  a  view  to  clarifying  the  problems  of  testing  the  theory,  a 
number  of  considerations  are  present  that  do  not  appear  in  axiomatizing 
deductive  systems.  Hence,  it  may  be  undesirable  to  imitate  too  closely 
the  canons  usually  prescribed  for  the  latter  type  of  axiomatization.  In 
addition  to  distinguishing  primitive  from  defined  terms,  it  may  be 
advantageous to subdivide the former class so as to distinguish terms that
are  defined  almost  everywhere  or  that  are  only  generically  defined. 

More  fundamentally,  whether  particular  terms  are  univocally  deter- 
mined by  the  system  will  depend  not  only  on  the  specific  sentences  that 
have  the  form  of  definitions  of  these  terms,  but  upon  the  whole  set  of 
sentences  of  the  system.  Our  analysis  of  an  actual  axiom  system  for 
Newtonian  particle  mechanics  bears  out  the  contentions  of  Braithwaite 
and  Quine  that  the  definitions  of  non-observables  often  are,  and  must 
be,  "implicit"  or  "contextual." 

What  does  the  analysis  suggest,  on  the  positive  side,  as  a  substitute  for 
the  too  strict  Condition  (2)  ?  In  general,  there  will  appear  in  an  axiom 
system  terms  that  are  direct  observables,  and  terms  that  are  not.  A 
minimum  requirement  from  the  standpoint  of  empiricism  is  that  the 
system  as  a  whole  be  over-identified:  that  there  be  possible  sets  of 
observations  that  would  be  inconsistent,  collectively,  with  the  sentences 
of  the  system.  We  have  seen  that  this  condition  by  no  means  guarantees 
that  all  the  non-observables  of  the  system  will  be  defined  terms,  or  even 
defined almost everywhere.
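
The requirement of over-identification can be illustrated with the simplest case of a difference equation, x(t + 1) = a x(t) with a single parameter a. The Python sketch below is an assumed example, not part of the axiomatization in [8]: two observations can always be fitted by some value of a, while a third observation may contradict the system.

```python
from fractions import Fraction

def consistent(path):
    """True if some constant a satisfies x(t + 1) = a * x(t) along the whole path.
    path: list of integer observations x(1), x(2), ..."""
    steps = list(zip(path, path[1:]))
    # Each step with x(t) != 0 forces a = x(t + 1) / x(t).
    forced = {Fraction(nxt, cur) for cur, nxt in steps if cur != 0}
    # A step with x(t) = 0 requires x(t + 1) = 0 whatever a is.
    zeros_ok = all(nxt == 0 for cur, nxt in steps if cur == 0)
    return zeros_ok and len(forced) <= 1

print(consistent([3, 6]))      # True: two observations cannot refute the system
print(consistent([1, 2, 4]))   # True: a = 2 fits all three
print(consistent([1, 2, 5]))   # False: these observations are jointly inconsistent
```

Here the parameter a is not an observable, yet the system is testable, because there exist possible sets of observations that no value of a can reconcile with it.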

A more radical empiricism would require that it be possible, by making
a sufficient number of observations, to determine uniquely the values of all
parameters  that  appear  in  the  system.  To  take  a  simple  example,  a  strict 
interpretation  of  this  condition  would  not  permit  masses  to  appear  in  the 
axiomatization  of  Newtonian  mechanics,  but  only  mass-ratios.  Resultant 
forces  would  be  admissible,  but  not  component  forces,  unless  sufficient 
postulates  were  added  about  the  form  of  the  force  law  to  overdetermine 
them.  We  may  borrow  Quine's  phrase  for  this  requirement,  and  say  that 
when  it  is  satisfied  for  some  set  of  terms,  the  terms  are  defined  contextually 
by  the  system.  5  The  condition  that  all  non-observables  be  defined  con- 
textually is  still  much  weaker,  of  course,  than  the  condition  that  they  be 
defined. 

For  reasons  of  elegance,  we  may  sometimes  wish  to  stop  a  little  short 
of  insisting  that  all  terms  in  a  system  be  defined  contextually.  We  have 
already  mentioned  a  suitable  example  of  this.  In  [8]  mass  ratios  are  defined 
almost  everywhere,  but  masses  are  not  defined  contextually,  even  in  an 
almost-everywhere  sense.  Still,  we  would  probably  prefer  the  symmetry  of 
associating  a  mass  number  with  each  particle  to  a  formulation  that 
arbitrarily  selected  one  of  these  masses  as  a  numeraire. 

Braithwaite  has  given  us  another  reason,  from  the  semantic  side,  for 
not  insisting  on  contextual  definition  of  all  terms.  He  observes  that  if  we 
leave  some  degrees  of  freedom  in  the  system,  this  freedom  allows  us  later 
to  add  additional  axioms  to  the  system,  without  introducing  internal 
inconsistencies,  when  we  have  reason  to  do  so.  Thus,  since  the  law  of 
conservation  of  energy  does  not  determine  the  zero  of  the  temperature 
scale,  the  zero  may  be  fixed  subsequently  by  means  of  the  gas  laws. 

Regardless  of  what  position  we  take  on  empiricism  in  axiomatizing 
scientific  theories,  it  would  be  desirable  to  provide  for  any  axiom  system 
theorems  characterizing  not  only  its  syntactical  properties  (e.g.,  the 
independence,  consistency,  and  completeness  of  the  axioms),  but  its 
semantic  properties  (e.g.,  the  degree  of  identifiability  of  its  non-observa- 
bles) as  well. 


5  Braithwaite's  "implicit  definition"  will  not  do  here,  for  he  applies  it  specifically 
to  the  weaker  condition  of  the  previous  paragraph. 



Bibliography 

[1]  BRAITHWAITE, R. B., Scientific Explanation. Cambridge 1955, XII + 376 pp.
[2]  CARNAP, R., Testability and meaning. Philosophy of Science, vol. 3 (1936), pp.
     419-471, and vol. 4 (1937), pp. 1-40.
[3]  HOOD, W. C., and T. C. KOOPMANS (eds.), Studies in Econometric Method. New
     York 1953, XIX + 323 pp.
[4]  KOOPMANS, T. C. (ed.), Statistical Inference in Dynamic Economic Models.
     New York 1950, XIV + 439 pp.
[5]  McKINSEY, J. C. C., A. C. SUGAR and P. SUPPES, Axiomatic foundations of
     classical particle mechanics. Journal of Rational Mechanics and Analysis, vol.
     2 (1953), pp. 253-272.
[6]  PADOA, A., Essai d'une théorie algébrique des nombres entiers, précédé d'une
     introduction logique à une théorie déductive quelconque. Bibliothèque du
     Congrès International de Philosophie, vol. 3 (1900).
[7]  QUINE, W., From a Logical Point of View. Cambridge (Mass.) 1953, VI + 184 pp.
[8]  SIMON, H. A., The axioms of Newtonian mechanics. Philosophical Magazine,
     ser. 7, vol. 33 (1947), pp. 888-905.
[9]  SIMON, H. A., Discussion: the axiomatization of classical mechanics. Philosophy
     of Science, vol. 21 (1954), pp. 340-343.
[10] TARSKI, A., Some methodological investigations on the definability of concepts.
     Chapter 10 in Logic, Semantics, Metamathematics. Oxford 1956, XIV + 467 pp.


Symposium  on  the  Axiomatic  Method 


AN  AXIOMATIC  THEORY  OF  FUNCTIONS  AND  FLUENTS 

KARL  MENGER 

Illinois  Institute  of  Technology,  Chicago,  Illinois,  U.S.A. 

The topic of this paper is a theory of some basic applications of mathematics to science. Part I deals with concepts of pure mathematics such as the logarithm, the second power, and the product, and with substitutions in the realm of those functions. Part II is devoted to scientific material such as time, gas pressure, and coordinates: objects that Newton called fluents. Part III formulates articulate rules for the interrelation of fluents by functions. Properly relativized, the latter play that connective role for which Leibniz originated the term function.

I.  FUNCTIONS 

Explicitly, a real function with a real domain (briefly, a function) may be defined as a class of consistent ordered pairs of real numbers. Here and in the sequel, two ordered pairs of any kind are called consistent unless their first members are equal while their second members are unequal. If each pair $\in f_1$ (that is, belonging to the function $f_1$) is also $\in f_2$ (in symbols, if $f_1 \subset f_2$), then $f_1$ is called a restriction of $f_2$, and $f_2$ an extension of $f_1$. The empty function (including no pair) will be denoted by $\emptyset$. The class of the first (the second) members of all pairs $\in f$ is called the domain of $f$ or dom $f$ (the range of $f$ or ran $f$). If ran $f$ includes exactly one number, then $f$ is said to be a constant function.
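The definitions above can be tried out on finite classes of pairs. The following is only a minimal sketch: the helper names (`is_consistent`, `dom`, `ran`, `is_restriction`, `is_constant`) and the integer sample data are assumptions made for illustration and do not come from the paper, which deals with real numbers.

```python
# Sketch: a "function" modelled as a finite set of consistent ordered pairs.
# All names below are illustrative assumptions, not the paper's notation.

def is_consistent(pairs):
    """True unless two pairs share a first member but differ in the second."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True

def dom(f):
    """Class of the first members of all pairs in f."""
    return {x for x, _ in f}

def ran(f):
    """Class of the second members of all pairs in f."""
    return {y for _, y in f}

def is_restriction(f1, f2):
    """f1 is a restriction of f2 when every pair of f1 also belongs to f2."""
    return f1 <= f2

def is_constant(f):
    """A constant function has exactly one number in its range."""
    return len(ran(f)) == 1

square = frozenset((x, x * x) for x in range(-3, 4))
square_pos = frozenset((x, x * x) for x in range(0, 4))
empty = frozenset()
```

In this model the empty set is a restriction of every function, which mirrors the general statement that the empty function is included in any function.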

The following typographical convention 1 will be strictly adhered to: roman type for numbers; italic type for functions.

For instance, the logarithmic function (briefly, log) is the class of all pairs $(\mathrm{a}, \log \mathrm{a})$ for any $\mathrm{a} > 0$. The constant function consisting of all pairs $(\mathrm{x}, 0)$ for any x will be denoted 2 by $\mathit{0}$. The following are examples of a formula and a general statement, respectively: $\log e = 1$, and $\emptyset \subset f$ for any $f$. Here, 0, 1, e as well as log, $\mathit{0}$, and $\emptyset$ are designations of specific entities, while a, x, and $f$ are variables (i.e., symbols replaceable with the designations of specific entities according to the respective legends): number variables or function variables, as indicated typographically.

1 Cf. Menger [10], referred to in the sequel as Calculus.

2 Symbols for constant functions that are more elaborate than italicized numerals, such as $c_1$ and $c_0$, must be used in order to express certain laws; e.g., that $c_{0+1} = c_0 + c_1$.

The intersection of any two functions is a function; e.g., that of cos and sin is the class of all pairs $\left((4n+1)\pi/4,\ (-1)^n/\sqrt{2}\right)$ for any integer $n$. The union of cos and sin, however, is not a function. From the set-theoretical point of view, functions do not constitute a Boolean algebra 3. But any two functions have a sum, a difference, a product, and a quotient, provided $\frac{f_1}{f_2}$ is defined as the class of all pairs $(\mathrm{x}, \mathrm{q})$ such that $(\mathrm{x}, \mathrm{p}_1) \in f_1$, $(\mathrm{x}, \mathrm{p}_2) \in f_2$ and $\frac{\mathrm{p}_1}{\mathrm{p}_2} = \mathrm{q}$ for some $\mathrm{p}_1$ and $\mathrm{p}_2$, a definition that dispenses with any reference to zeros in the denominators. For instance,

$$\cot = \frac{\mathit{1}}{\tan}, \quad \frac{f}{\mathit{0}} = \emptyset, \quad \text{and} \quad \frac{f_1}{f_2} \cdot f_2 \subset f_1 \quad \text{for any } f, f_1, \text{ and } f_2.$$

Moreover, any function $f_2$ may be substituted into any function $f_1$, the result $f_1 f_2$ (denoted by mere juxtaposition, whereas multiplication will always be denoted by a dot!) being the class of all pairs $(\mathrm{x}, \mathrm{z})$ such that $(\mathrm{x}, \mathrm{y}) \in f_2$ and $(\mathrm{y}, \mathrm{z}) \in f_1$ for some y. The identity function, i.e., the class of all pairs $(\mathrm{x}, \mathrm{x})$ for any x (an object of paramount importance) will be denoted 4 by $j$. Its main property is bilateral neutrality under substitution:

(1)   $jf = f = fj$ for any $f$.
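Substitution as defined here, and the bilateral neutrality of the identity function, can be illustrated on finite classes of pairs. This is a sketch only: `substitute` and `identity_on` are hypothetical helper names, and the identity is restricted to a finite sample of numbers that happens to cover the domain and range of the example.

```python
# Sketch: substitution of one finite function into another.
# Names and sample data are illustrative assumptions.

def substitute(f1, f2):
    """The result of substituting f2 into f1:
    all (x, z) such that (x, y) is in f2 and (y, z) is in f1 for some y."""
    outer = dict(f1)
    return frozenset((x, outer[y]) for x, y in f2 if y in outer)

def identity_on(numbers):
    """Finite stand-in for the identity function: all pairs (x, x)."""
    return frozenset((x, x) for x in numbers)

f = frozenset({(0, 1), (1, 2), (2, 0)})
g = frozenset({(0, 2), (1, 5), (3, 1)})
j = identity_on(range(10))

fg = substitute(f, g)   # pairs of g whose second member lies in dom f
```

The pair (1, 5) of `g` contributes nothing to `fg` because 5 is not in the domain of `f`; no special treatment of "undefined" values is needed.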

For each $f$, there is a bilaterally inverse function 5, $\mathrm{Inv}\, f$, which is the largest class of pairs $(\mathrm{x}, \mathrm{y})$ such that $(\mathrm{y}, \mathrm{x}) \in f$ and that, under substitution, $f\, \mathrm{Inv}\, f \subset j$ and $(\mathrm{Inv}\, f) f \subset j$. For instance, $\mathrm{Inv}\, j^3 = j^{1/3}$ and $\mathrm{Inv} \exp = \log$. If $j_+^2$ is the class of all pairs $(\mathrm{x}, \mathrm{x}^2)$ for any $\mathrm{x} \geq 0$, then $\mathrm{Inv}\, j_+^2 = j^{1/2}$ and $\mathrm{Inv}\, j^{1/2} = j_+^2$; similarly, $\mathrm{Inv}\, j_-^2 = -j^{1/2}$ and $\mathrm{Inv}(-j^{1/2}) = j_-^2$. But $\mathrm{Inv}\, j^2$ consists of the single pair (0, 0); and $\mathrm{Inv} \cos = \emptyset$, while $\mathrm{Inv}\, f$ is a branch of arccos if $f \subset \cos$ and dom $f$ is the interval $[n\pi, (n+1)\pi]$, for some integer $n$.

3 For this reason, the postulational theories of binary relations, which are based on Boolean algebra (cf. especially McKinsey [8] p. 85 and Tarski [22] p. 73), are inapplicable to functions.

4 Cf. Calculus, p. 74 and pp. 99-105. Cf. also Menger [11] and [12].

5 Cf. Calculus, pp. 91-95, where $\mathrm{Inv}\, f$ is denoted by $j // f$. The fertility of this concept of inverse functions has been brought out by M. A. McKiernan's interesting and promising studies on operators. Cf. McKiernan [6], [7].
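The bilateral inverse can be sketched for finite classes of pairs. Under the reading assumed here, a reversed pair (y, x) survives in Inv f exactly when y is the image of a single x, since any second preimage would violate the inclusion of (Inv f) f in the identity. The function name `inv` and the integer samples are illustrative assumptions.

```python
from collections import Counter

# Sketch: finite analogue of the bilateral inverse Inv f.
# Assumption: (y, x) is kept only when y has exactly one preimage under f.

def inv(f):
    hits = Counter(y for _, y in f)          # how often each value is taken
    return frozenset((y, x) for x, y in f if hits[y] == 1)

cube = frozenset((x, x ** 3) for x in range(-3, 4))
square = frozenset((x, x * x) for x in range(-3, 4))
square_plus = frozenset((x, x * x) for x in range(0, 4))
```

On these samples the square function keeps only the pair (0, 0), matching the statement about the second power of the identity, while the cube and the non-negative branch of the square are inverted completely.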

In the traditional literature, the identity function has remained anonymous, one of the symptoms of the neglect of substitution in analysis. The usual reference, "the function x", and the symbol x are complete failures in basic assertions. Even in order to assert substitutive neutrality, concisely expressed in (1), analysts are forced to introduce a better symbol than x (an ad hoc name of the identity function, say $h$) and then must resort to an awkward implication: If $h(\mathrm{x}) = \mathrm{x}$ for each x, then $h(f(\mathrm{x})) = f(\mathrm{x}) = f(h(\mathrm{x}))$ for any $f$ and any $\mathrm{x} \in \mathrm{dom}\, f$.

The overemphasis on additive-multiplicative processes, which is characteristic of mathematics in the second quarter of this century, becomes particularly striking in passing from theories of functions based on explicit definitions to postulational theories: theories of rings of functions, of linear function spaces, etc., which stress those properties that functions or entities of any kind share with numbers. One of the few exceptions doing justice to substitution is the trioperational algebra of analysis 6. In it, functions are (undefined) elements subject to three (undefined) operations. With regard to the first two, denoted by $+$ and $\cdot$, the elements constitute a ring including neutral elements, $\mathit{0}$ and $\mathit{1}$. The third, called substitution and denoted by juxtaposition, is associative and right-distributive with regard to the ring operations 7:

(2)   $(f + g)h = fh + gh$ and $(f \cdot g)h = fh \cdot gh$ for any $f$, $g$, $h$.

For many purposes, it is important to postulate a neutral element $j$ satisfying (1).

Trioperational algebra has interesting applications to rings of polynomials as well as non-polynomials 8 but does not apply to the realm of all functions, even though the three operations can be defined for any two functions. The only ring postulate that is not generally satisfied is that, for each $g$, there exist an $f$ such that $f + g = \mathit{0}$. For instance, $-\log + \log$ is not $\mathit{0}$, but rather the restriction of $\mathit{0}$ consisting of all $(\mathrm{x}, 0)$ for $\mathrm{x} > 0$. Only $f + \log \subset \mathit{0}$ has solutions (namely, any $f \subset -\log$). What narrowed the scope of trioperational algebra in its original form was the fact that it did not take the relation $\subset$ into account.

6 Cf. Menger [13], [14].

7 In keeping with the traditional attitude toward substitution, the laws (2) are hardly ever mentioned even though they are as important in analysis as is the multiplicative-additive distributive law.

8 Cf. especially Milgram [18] p. 65, Heller [4] and Nöbauer [19].
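The failure of the additive-inverse postulate can be seen on a finite sample of numbers. In the sketch below, `add` and `neg` are hypothetical helpers, and `grid` is an arbitrary finite stand-in for the real line; the sum of two functions is taken only where both are defined.

```python
import math

# Sketch: -log + log is a proper restriction of the constant function 0.
# The grid and all helper names are illustrative assumptions.

def add(f1, f2):
    """Sum of two functions, defined only on the common part of their domains."""
    a, b = dict(f1), dict(f2)
    return frozenset((x, a[x] + b[x]) for x in a.keys() & b.keys())

def neg(f):
    """Negative of a function."""
    return frozenset((x, -y) for x, y in f)

grid = range(-3, 4)
zero = frozenset((x, 0.0) for x in grid)                  # constant function 0 on the grid
log = frozenset((x, math.log(x)) for x in grid if x > 0)  # log is defined for x > 0 only

total = add(neg(log), log)
```

Here `total` contains the pairs (x, 0) only for positive x, so it is included in `zero` without being equal to it.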

A more satisfactory postulational approach to functions may be based on the following idea of a hypergroup: a set $\mathcal{G}$ satisfying six postulates:

I. $\mathcal{G}$ is partially ordered by a relation $\subset$. For some purposes it is convenient to assume that $\mathcal{G}$ includes a (necessarily unique) minimal element $\emptyset$, such that $\emptyset \subset \gamma$ for each $\gamma$; or, even further, that $\mathcal{G}$ is atomized in the sense that (1) for each $\gamma \neq \emptyset$, at least one $\alpha \subset \gamma$ is an atom (i.e., such that $\alpha' \subset \alpha$ if and only if $\alpha' = \emptyset$); (2) $\gamma_1 \subset \gamma_2$ if and only if each atom $\subset \gamma_1$ is also $\subset \gamma_2$. For other purposes, $\mathcal{G}$ may be assumed to be intersectional, i.e., to include, for any two elements $\gamma_1$ and $\gamma_2$, a maximal element $\subset \gamma_1$ and $\subset \gamma_2$: an intersection, $\gamma_1 \cap \gamma_2$.

II. In $\mathcal{G}$, there is an associative operation, $\circ$.

III. $\mathcal{G}$ includes a bilaterally and absolutely neutral element, $\nu$, such that $\gamma \circ \nu = \gamma = \nu \circ \gamma$ for any $\gamma$.

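Clearly, $\nu$ is unique. The connection between $\subset$, $\circ$, and $\nu$ is established in the following postulate that simplifies the author's original development and is due to Prof. A. Sklar.

IV. $\gamma \subset \delta$ if there exists an element $\nu' \subset \nu$ such that $\nu' \circ \delta = \gamma$, and if and only if there exists an element $\nu'' \subset \nu$ such that $\gamma = \delta \circ \nu''$.

It readily follows that $\circ$ is bilaterally monotonic; that is to say, $\gamma_1 \subset \gamma_2$ implies $\gamma_1 \circ \gamma \subset \gamma_2 \circ \gamma$ and $\gamma \circ \gamma_1 \subset \gamma \circ \gamma_2$ for any $\gamma_1$, $\gamma_2$, $\gamma$. If there is a minimal element, then $\emptyset$ may be a bilateral strict annihilator:

$\emptyset \circ \gamma = \emptyset = \gamma \circ \emptyset$ for each $\gamma$.

Moreover, if $\nu_1 \subset \nu$ and $\nu_2 \subset \nu$, then $\nu_1 \circ \nu_2 \subset \nu_1$ and $\subset \nu_2$; thus, if $\mathcal{G}$ is intersectional, $\nu_1 \circ \nu_2 \subset \nu_1 \cap \nu_2$.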

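V. For each $\gamma$, there exist two unilaterally and relatively neutral elements, $L\gamma$ and $R\gamma$ (the left-neutral and the right-neutral of $\gamma$) such that:

1) $L\gamma \circ \gamma = \gamma = \gamma \circ R\gamma$;

2) $L(\gamma_1 \circ \gamma_2) \subset L\gamma_1$ and $R(\gamma_1 \circ \gamma_2) \subset R\gamma_2$ for each $\gamma_1$ and $\gamma_2$;

3) if $\mu \subset \nu$, then $L\mu \subset \mu$ and $R\mu \subset \mu$.

Clearly, $L\gamma \subset \nu$ and $R\gamma \subset \nu$ for each $\gamma$. If $L\gamma = \gamma$ and/or $R\gamma = \gamma$, then $\gamma \subset \nu$. If $\gamma \subset \nu$, then $L\gamma = \gamma = R\gamma$. Hence $LL\gamma = RL\gamma = L\gamma$ for every $\gamma$. Moreover, $L\gamma = \emptyset$, $R\gamma = \emptyset$, and $\gamma = \emptyset$ are equivalent. If $\gamma \subset \delta$, then $L\gamma \subset L\delta$ and $R\gamma \subset R\delta$. If $\kappa$ is an annihilator in the sense that $\gamma \circ \kappa = L\kappa$ and $\kappa \circ \gamma = R\kappa$ for each $\gamma$, and if $\kappa \subset \nu$, then $\kappa = \emptyset$. Moreover, $L\gamma$ and $R\gamma$ are characterized among the elements $\subset \nu$ by the following minimum property:

If $\mu \subset \nu$, then $\mu \circ \gamma = \gamma$ implies $L\gamma \subset \mu$, and $\gamma \circ \mu = \gamma$ implies $R\gamma \subset \mu$.

It follows that $L\gamma$ and $R\gamma$ are unique for each $\gamma$. If $\circ$ is commutative, then $L\gamma = R\gamma$ for each $\gamma$.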

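It will suffice, here, bypassing unilaterally opposite elements, to postulate finally

VI. For each $\gamma$, there is a bilaterally opposite element $\mathrm{Op}\, \gamma$ such that $\mathrm{Op}\, \gamma \circ \gamma \subset R\gamma$ and $\gamma \circ \mathrm{Op}\, \gamma \subset L\gamma$ for each $\gamma$, and which, if one sets $\mathrm{Op}\, \gamma \circ \gamma = R'\gamma$ and $\gamma \circ \mathrm{Op}\, \gamma = L'\gamma$, has the following minimax property:

1) if $\delta \circ \gamma \subset R\gamma$ and $\gamma \circ \delta \subset L\gamma$, then $\delta \circ \gamma \subset R'\gamma$ and $\gamma \circ \delta \subset L'\gamma$;

2) if $\delta \circ \gamma = R'\gamma$ and $\gamma \circ \delta = L'\gamma$, then $\mathrm{Op}\, \gamma \subset \delta$.

$\mathrm{Op}\, \gamma$ is unique for each $\gamma$, and $L'\gamma \circ \gamma = \gamma \circ R'\gamma$, which might be called $C\gamma$, the core of $\gamma$. If $\mu \subset \nu$, then $\mathrm{Op}\, \mu = R'\mu = L'\mu = C\mu = \mu$. For each atom $\alpha$, $\mathrm{Op}\, \alpha$ is an atom, and $C\alpha = \alpha$. Additional assumptions would guarantee that

$\mathrm{Op}\, \delta \circ \mathrm{Op}\, \gamma \subset \mathrm{Op}(\gamma \circ \delta)$; $\mathrm{Op}\, \mathrm{Op}\, \gamma \subset \gamma$; $\mathrm{Op}\, \mathrm{Op}\, \mathrm{Op}\, \gamma = \mathrm{Op}\, \gamma$ for each $\gamma$, $\delta$.

However, $\gamma \subset \delta$ does not imply $\mathrm{Op}\, \gamma \subset \mathrm{Op}\, \delta$.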

An element $\gamma$ of a hypergroup will be called right-elementary (or left-elementary) if $\delta \subset \gamma$ implies $R\delta \subset R\gamma$ (or $L\delta \subset L\gamma$). Each atom is bilaterally elementary (briefly, elementary). If $\mathcal{G}$ is commutative and $\gamma$ is unilaterally elementary, $\gamma$ is elementary. If $\mathcal{G}$ is atomized, then $\rho$ is right-elementary if and only if $\rho \circ \mu = \rho$ implies $\mu \subset \nu$. If, in contrast, $\kappa \circ \gamma \subset \kappa$ for each $\gamma$, then $\kappa$ may be called a left-annihilator; and each $\kappa'$ that is $\subset \kappa$, a left-quasi-annihilator. Clearly, $\kappa' \circ \gamma$ and $\gamma \circ \kappa'$ are left-quasi-annihilators for any $\gamma$. If each element of $\mathcal{G}$ is right-elementary (or elementary), then $\mathcal{G}$ will be said to be right-elementary 9 (or elementary).

With regard to addition as well as multiplication, the set of all functions is a commutative elementary hypergroup. The universal neutrals are $\mathit{0}$ and $\mathit{1}$; the relative neutrals of $f$ are, as it were, vertical projections of $f$ on $\mathit{0}$ and $\mathit{1}$, respectively; the opposites of $f$ are $-f$ and $\frac{\mathit{1}}{f}$.

9 Prof. B. Schweizer proposes to call $\delta$ a right-neutralizer of $\gamma$ if $\gamma \circ \delta \subset \nu$ and $L\delta \subset R\gamma$, and to say that $\delta$ is (1) maximal if $\gamma \circ \delta' \subset \gamma \circ \delta$ for each right-neutralizer $\delta'$; (2) saturating if $\gamma \circ \delta = L\gamma$. One might then postulate that each element $\gamma$, on either side, has at least one maximal neutralizer or at least one saturating neutralizer.

With regard to substitution, the set of all functions is a (non-commutative) right-elementary hypergroup. The universal neutral is $j$. The relative neutrals, $Rf$ and $Lf$, correspond to dom $f$ and ran $f$, respectively. 10 Thus the contrast between functions (classes of pairs of numbers) and their domains and ranges (classes of numbers) disappears. $\mathrm{Op}\, f$ is $\mathrm{Inv}\, f$. The left annihilators $\neq \emptyset$ are what may be called universal constant functions; the left-quasi-annihilators $\neq \emptyset$ are the constant functions 11.
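For finite classes of pairs, the relative neutrals under substitution can be written down directly as restrictions of the identity. The sketch below assumes that reading (R f is the identity on dom f, L f the identity on ran f); `substitute`, `right_neutral`, and `left_neutral` are hypothetical helper names.

```python
# Sketch: relative neutrals of a finite function under substitution.
# Assumption: R f = identity on dom f, L f = identity on ran f.

def substitute(f1, f2):
    """All (x, z) such that (x, y) is in f2 and (y, z) is in f1 for some y."""
    outer = dict(f1)
    return frozenset((x, outer[y]) for x, y in f2 if y in outer)

def right_neutral(f):
    """R f: the identity restricted to dom f."""
    return frozenset((x, x) for x, _ in f)

def left_neutral(f):
    """L f: the identity restricted to ran f."""
    return frozenset((y, y) for _, y in f)

f = frozenset({(0, 1), (1, 2), (2, 0), (5, 7)})
g = frozenset({(0, 2), (1, 5), (3, 9)})
fg = substitute(f, g)
```

With these samples one can check parts 1) and 2) of postulate V: substituting `right_neutral(f)` into `f`, or `f` into `left_neutral(f)`, returns `f`, and the neutrals of `fg` are included in the left neutral of `f` and the right neutral of `g`.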

Another example of a hypergroup is the set of all binary relations in some universal set with regard to what logicians call the relative product 12. The universal neutral is the identity relation, while the relative neutrals again correspond to domains and ranges. $\mathrm{Op}\, \gamma$ is a restriction (in general, a proper restriction) of the converse of the relation $\gamma$.

Geometrically, the situation may be interpreted in a set (a "plane" consisting of "points") that is decomposed into mutually disjoint subsets ("vertical lines"). "Simple" sets, i.e., sets having at most one point in common with each vertical line, are the counterpart of functions. This vertical simplicity corresponds to right-side elementariness of functions. Substitution can be illustrated if, secondly, the plane is decomposed into disjoint subsets ("horizontal lines") each of which has exactly one point in common with each vertical line; and if, thirdly, there is given a "diagonal" set having exactly one point in common with each vertical line as well as with each horizontal line. The diagonal corresponds to $j$; each horizontal line, to a universal constant function; the vertical (the horizontal) projection of a simple set $f$ on the diagonal, to $Rf$ (to $Lf$); the points, to atoms. Fig. 1, p. 460, based on the assumption of ordinary vertical and horizontal lines and a straight diagonal, $j$, shows a simple plane construction 13 of the result of substituting $g$ into $f$. For any point $\alpha$ in the set $g$, move horizontally to $j$, then vertically to $f$, and finally horizontally back to the vertical line through $\alpha$. The set of all points thus obtained is $fg$. In the figure, $f$ has the shape of an exponential curve; $g$, that of $-j^2$; hence $fg$, that of the probability curve.

10 In contrast to groups and hypergroups, a Brandt groupoid (Mathematische Annalen, vol. 96) only permits the composition of some elements. In "categories" (i.e., essentially, groupoids) of mappings of groups on groups, MacLane calls the one-side identities of a mapping its domain and range.

11 In a self-explanatory way, one can say that functions constitute a commutative elementary hyperfield with regard to addition and multiplication (with the multiplicative annihilator $\mathit{0}$) and (non-commutative) right-elementary hyperfields with regard to addition and substitution as well as multiplication and substitution. The functions may also be said to constitute a trioperational hyperalgebra.

12 Cf., e.g., McKinsey [8] and Tarski [22].

13 Cf. Calculus pp. 89 ff. The traditional postulational theory of binary relations is inapplicable to functions (cf. footnote 3). On the other hand, the plane construction of functional substitution here described may, as Prof. M. A. McKiernan observed, be utilized for binary relations instead of the 3-dimensional construction proposed by Tarski [22] pp. 78, 79.

Notwithstanding the analogy (brought out in the concept of a hypergroup) of addition and multiplication with substitution, the latter has a definite primacy. In an atomized non-commutative hypergroup $\mathcal{G}$, any binary operation $\times$ (such as $+$ and $\cdot$), defined in the class of all atoms $\subset \nu$, may be extended to any two elements $\gamma'$ and $\gamma''$ of $\mathcal{G}$ by defining $\gamma' \times \gamma''$ as the minimum element including all atoms $\alpha$ such that there exist two atoms $\alpha' \subset \gamma'$ and $\alpha'' \subset \gamma''$ satisfying the following conditions:

$R\alpha = R\alpha' = R\alpha''$ and $L\alpha = L\alpha' \times L\alpha''$.

Fig. 1

(This is essentially how the arithmetical operations are extended from numbers to functions.) Moreover, Neg $f$ and Rec $f$, the negative and the reciprocal of $f$, are obtainable by substituting $f$ into $-j$ and $j^{-1}$, respectively; and, as will be shown presently, even $f + g$, $f \cdot g$, and $\frac{f_1}{f_2}$ can be obtained from 2-place functions $S$, $P$, and $Q$ by substitution. In contrast, there are no functions of any kind which, for each $f$ and $g$, would yield $fg$ or even $\mathrm{Inv}\, f$ by additions and multiplications. 14 Beyond any question, in the realm of functions substitution is the operation par excellence.

14 In fact, no 2-place function yields $fg$ even by substitution of $f$ and $g$. Cf. Calculus, p. 304.

For any integer $m \geq 1$, a class of consistent pairs whose second members are real numbers while their first members are sequences of $m$ real numbers is called, briefly, an m-place function 15 and will be designated by a capital italic, except where lower case italics emphasize the 1-place character of functions such as those treated in what precedes. The class $Q$ of the pairs $\left((\mathrm{x}_1, \mathrm{x}_2), \frac{\mathrm{x}_1}{\mathrm{x}_2}\right)$ for all $\mathrm{x}_1$ and $\mathrm{x}_2 \neq 0$ is a 2-place function. So are sum and product, $S$ and $P$, from which, because of their associativity, an m-place sum and product, $S^m$ and $P^m$, can be derived for each $m > 2$. Of particular importance is, for any two integers $1 \leq i \leq k$, the i-th k-place selector function $I_i^k$, that is, the class of all pairs $((\mathrm{x}_1, \ldots, \mathrm{x}_k), \mathrm{x}_i)$ for any $\mathrm{x}_1, \ldots, \mathrm{x}_k$. Clearly, $I_1^1 = j$.

There are two main types of substitution of sequences of $m$ functions into an m-place function (which, for $m = 1$, coincide with one another and with the substitution as defined on p. 455):

a) product substitution: $F^m[G_1, \ldots, G_m]$, whose domain is a subset of the Cartesian product, $\mathrm{dom}\, G_1 \times \ldots \times \mathrm{dom}\, G_m$, which is the class of all sequences $(\gamma_1, \ldots, \gamma_m)$ for any $\gamma_1 \in \mathrm{dom}\, G_1, \ldots, \gamma_m \in \mathrm{dom}\, G_m$.

b) intersection substitution: $F^m(G_1, \ldots, G_m)$, whose domain is $\subset \mathrm{dom}\, G_1 \cap \ldots \cap \mathrm{dom}\, G_m$. Unless $G_1, \ldots, G_m$ have the same place-number, that intersection is empty and $F^m(G_1, \ldots, G_m) = \emptyset$; e.g., $P(j, S) = \emptyset$, while $P(I_1^2, S)$ and $P(I_2^2, S)$ are non-empty. Clearly, $Q(f_1, f_2) = \frac{f_1}{f_2}$, as defined on p. 455; and $I_i^m(G_1, \ldots, G_m) = G_i$ for any $m$ functions of the same place-number. A simple generalization of the plane construction described on p. 459 to the 3-dimensional space 16 yields $F^2(G_1^2, G_2^2)$.

Traditionally, $P(j^2, \log)$, $P[j^2, \log]$, $P(P, S)$, $P[P, S]$, $P(I_2^2, S)$, $P[j^2, S]$ are referred to as the functions $x^2 \log x$, $x^2 \log y$, $xy(x + y)$, $xy(u + v)$, $y(x + y)$, $x^2(y + z)$, respectively.
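The difference between the two types of substitution can be sketched with ordinary callables. In this illustration an m-place function is modelled as a pair (place-number, callable); domains are ignored, and `None` stands in for the empty function. The helper names `product_sub` and `intersection_sub` are assumptions made for the example.

```python
import math

# Sketch: product substitution F[G1, ..., Gm] versus intersection
# substitution F(G1, ..., Gm). An m-place function is (m, callable).

j2 = (1, lambda x: x * x)          # second power of the identity
log = (1, math.log)
S = (2, lambda x, y: x + y)        # 2-place sum
P = (2, lambda x, y: x * y)        # 2-place product

def product_sub(F, *Gs):
    """F[G1, ..., Gm]: each G consumes its own block of arguments."""
    assert F[0] == len(Gs)
    places = sum(n for n, _ in Gs)
    def h(*args):
        assert len(args) == places
        vals, pos = [], 0
        for n, g in Gs:
            vals.append(g(*args[pos:pos + n]))
            pos += n
        return F[1](*vals)
    return (places, h)

def intersection_sub(F, *Gs):
    """F(G1, ..., Gm): all G receive the same arguments.
    Returns None (the empty function) unless the place-numbers agree."""
    assert F[0] == len(Gs)
    arities = {n for n, _ in Gs}
    if len(arities) != 1:
        return None
    n = arities.pop()
    return (n, lambda *args: F[1](*(g(*args) for _, g in Gs)))
```

For instance, `intersection_sub(P, j2, log)` behaves like "x² log x" with one argument, whereas `product_sub(P, j2, log)` behaves like "x² log y" with two.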

Either type of substitution can be extended to a realm of sequences of functions. With each sequence, besides the number of functions in it, called the sequence-number, a place-number will be associated. Either substitution of a second sequence into a first presupposes that the sequence-number of the second be equal to the place-number of the first.

15 One might introduce numbers as 0-place functions.

16 Cf. Menger [15] p. 224. Recently, S. Penner in his Master's thesis at Illinois Institute of Technology has extended the geometric axiomatics of substitution, outlined on p. 459 of the present paper, from 1-place to m-place functions in the (m + 1)-dimensional space.

a) An s-place array of $r$ functions is a sequence such that the sum of the $r$ place-numbers is $s$; for instance $\Phi_r^s = [F_1^{m_1}, \ldots, F_r^{m_r}]$, where $s = m_1 + \ldots + m_r$. Product substitution, defined by

$$\Phi_r^s \Gamma_s^t = \left[F_1^{m_1}[G_1, \ldots, G_{m_1}], \ldots, F_r^{m_r}[G_{s - m_r + 1}, \ldots, G_s]\right],$$

clearly is associative and admits unilateral neutrals:

$$\Phi_r^s[\Psi_s^t X_t^u] = [\Phi_r^s \Psi_s^t] X_t^u \quad \text{and} \quad J_r^r \Phi_r^s = \Phi_r^s = \Phi_r^s J_s^s,$$

where, for any $k \geq 1$, $J_k^k = [j]^k$, an array of $k$ functions $j$.

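b) An s-place throw of $r$ functions is a sequence such that all $r$ place-numbers are $s$; for instance, $\mathbf{F}_r^s = (F_1^s, \ldots, F_r^s)$. Intersection substitution, defined by

$$\mathbf{F}_r^s \mathbf{G}_s^t = \left(F_1^s(G_1^t, \ldots, G_s^t), \ldots, F_r^s(G_1^t, \ldots, G_s^t)\right),$$

is associative and admits unilateral neutrals. Let $\mathbf{I}_k^k$ be the k-place throw of all k-place selector functions in the natural order, and let $(\mathbf{I}_k^k)^h$ denote the k-place throw of $hk$ functions forming a chain of $h$ throws $\mathbf{I}_k^k$. Then

$$\mathbf{F}_r^s(\mathbf{G}_s^t \mathbf{H}_t^u) = (\mathbf{F}_r^s \mathbf{G}_s^t)\mathbf{H}_t^u \quad \text{and} \quad \mathbf{I}_r^r \mathbf{F}_r^s = \mathbf{F}_r^s = \mathbf{F}_r^s \mathbf{I}_s^s.$$

By means of intersection substitution, the array $\Phi_r^{rs}$ and the throw $\mathbf{F}_r^s$ with the same components $F_1^s, \ldots, F_r^s$ can be connected: $\mathbf{F}_r^s = \Phi_r^{rs}(\mathbf{I}_s^s)^r$.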

Commutativity and associativity of addition and the distributive law can be expressed in the formulae:

$$S = S(I_2^2, I_1^2); \quad S[j, S] = S[S, j]; \quad P[S, j] = S[P, P](I_1^3, I_3^3, I_2^3, I_3^3).$$
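The variable-free form of the distributive law can be checked numerically. The sketch below reads the law as: the 3-place function obtained by product substitution of S and the identity into P equals the 4-place function S[P, P] with the selectors (first, third, second, third of three) substituted by intersection. That reading, and the helper `selector`, are assumptions made for illustration.

```python
from itertools import product as cartesian

# Sketch: commutativity of addition and the distributive law written
# with selector functions. All names are illustrative assumptions.

S = lambda x, y: x + y          # 2-place sum
P = lambda x, y: x * y          # 2-place product

def selector(i, k):
    """The i-th k-place selector: picks the i-th of k arguments."""
    def pick(*xs):
        assert len(xs) == k
        return xs[i - 1]
    return pick

I12, I22 = selector(1, 2), selector(2, 2)
I13, I23, I33 = selector(1, 3), selector(2, 3), selector(3, 3)

# Commutativity: S compared with S(I_2^2, I_1^2).
S_swapped = lambda x, y: S(I22(x, y), I12(x, y))

# Distributive law: P[S, j] compared with S[P, P](I_1^3, I_3^3, I_2^3, I_3^3).
lhs = lambda x, y, z: P(S(x, y), z)
SPP = lambda a, b, c, d: S(P(a, b), P(c, d))
rhs = lambda x, y, z: SPP(I13(x, y, z), I33(x, y, z), I23(x, y, z), I33(x, y, z))

sample = list(cartesian(range(-2, 3), repeat=3))
```

Evaluating `lhs` and `rhs` on every triple in `sample` gives identical values, which is the familiar identity (x + y)z = xz + yz stated without number variables.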

The existence of right-neutrals has the following simple

Corollary. Every non-empty function of any number of places lends itself to substitutions (of both types) with non-empty results.

For  any  k  >  1,  the  k-place  throws  of  k  functions  form  a  hypergroup 
by  intersection  substitution.  More  generally,  throws  as  well  as  arrays  of 
functions  constitute  what  might  be  called  hypergroupoids  —  a  concept 
that  will  be  studied  elsewhere. 

Both types of substitution can be extended to n-ary relations. For instance, if $\mathrm{P}$ is a class of sequences of $n + 1$ elements; and if $\Pi_1, \ldots, \Pi_n$ are classes of (not necessarily consistent) ordered pairs, then $\mathrm{P}[\Pi_1, \ldots, \Pi_n]$ will denote the class of all sequences $(\alpha_1, \ldots, \alpha_n, \gamma)$ such that for some $\beta_1, \ldots, \beta_n$:

$$(\alpha_1, \beta_1) \in \Pi_1, \ldots, (\alpha_n, \beta_n) \in \Pi_n \quad \text{and} \quad (\beta_1, \ldots, \beta_n, \gamma) \in \mathrm{P}.$$

In what precedes, only real functions have been considered, but all statements (including the following remarks) remain valid if one selects a ring (or, where division is involved, a field) and writes element of the ring (the field) instead of real number.

The definitions of arithmetical operations for functions (addition, etc.) merely presuppose classes of consistent pairs whose second members are real numbers. The nature of the first members plays no role. Operating on functions with disjoint domains, however, yields $\emptyset$; for instance, $j_-^2 + \log = \emptyset$ and $j \cdot S = \emptyset$. Hence, for some results in a class of functions to be non-empty, it is necessary that some domains be non-disjoint 17. With this proviso, the arithmetical operations may be extended to what I will call functors: classes of consistent quantities, if quantity is any ordered pair whose second member is a number 18. Of course only functors whose domains consist of mathematical entities are objects of pure mathematics. Mathematical functors that are not functions have been called functionals; e.g., the class $\int_0^1$ of all pairs $(f, \int_0^1 f)$ for any integrable function $f$.

Substitution presents an altogether different situation. If the result $f_1 f_2$ is non-empty it is so because the first member of the pair $(\mathrm{y}, \mathrm{z}) \in f_1$ is the second member in a pair $(\mathrm{x}, \mathrm{y}) \in f_2$; in other words, because functions are classes of pairs whose first and second members are of like nature 19. A similar reason accounts for substitutions of sequences of functions into functions of several places. In view of the corollary on p. 462, the only functors that lend themselves to substitution with some non-empty results are the functions. Calling every class of consistent quantities a "function" (which has been proposed) thus epitomizes overemphasis on addition and multiplication as well as supreme disregard for the paramount operation in the realm of functions: substitution.

II.    FLUENTS 

The objects of science and geometry to which Newton referred as fluents and which he and his successors have treated with supreme virtuosity have not, in the classical literature, ever been defined either explicitly or, by postulates, implicitly. There are of course scientific procedures determining, for instance, $p\gamma_0$, the gas pressure in atm. of a specific instantaneous gas sample $\gamma_0$, corresponding to arithmetical definitions of log 2. But the function log (even though its definition on p. 454 presupposes the understanding of log x for any x) must be distinguished from the numbers log x as well as from the class ran log. Similarly, $p$ (in the sequel, fluents as well as 1-place functions will be designated by lower case italics) must be distinguished from the numbers $p\gamma$ as well as from ran $p$ (the class of all those numbers). The fluent $p$ is the class of all pairs $(\gamma, p\gamma)$ for any instantaneous gas sample $\gamma$.

17 Functions of the same place-number, and even throws, satisfy this condition, and actually lend themselves to meaningful addition and multiplication.

18 Cf. Calculus, Chapter VII.

19 What that common nature of the elements is plays no role in the definition of substitution. For any set $S$, one may consider classes of consistent pairs of elements of $S$ (self-mappings of $S$) and define substitution. Examples include n-place throws of $n$ functions.
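The distinction between a fluent, its individual values, and its range can be made concrete with a small mapping. The sample labels and pressure readings below are invented purely for illustration; only the structure (a class of pairs whose first members are extramathematical objects and whose second members are pure numbers) follows the text.

```python
# Sketch: a fluent as a class of pairs (sample, number).
# Sample labels and values are hypothetical.

p = {"sample_a": 1.5, "sample_b": 0.9, "sample_c": 1.5}    # gas pressure in atm.
q = {"sample_a": 0.9, "sample_b": 1.5, "sample_c": 0.9}    # a different fluent

pairs_of_p = set(p.items())      # the fluent itself: all pairs (sample, value)
value = p["sample_a"]            # a single number, the value for one sample
ran_p = set(p.values())          # the class of all values
ran_q = set(q.values())
```

Here `p` and `q` have the same range although they are different classes of pairs, which is why a fluent cannot be identified with the class of its values.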

Besides this (as it were, objective) pressure $p$, there is, for any observer A, a fluent $p_A$, the gas pressure in atm. observed by A, which is the class of all pairs $(\alpha, p_A\alpha)$ for any act $\alpha$ of A's reading a pressure gauge calibrated in atm., where $p_A\alpha$ denotes the number (the pure number, say, 1.5) read by A as the result of $\alpha$.

Thus extramathematical features (such as "denomination" and "dimension") that are often attributed to the values of $p$ and $p_A$ are, as it were, absorbed in the definitions of these fluents. Their values being pure numbers, also ran $p$ and ran $p_A$ are objects of pure mathematics. In contrast, dom $p$ and dom $p_A$ and, therefore, $p$ and $p_A$ themselves are extramathematical objects. The definition of an entire fluent adds to the knowledge of its values the idea of a class, a class that is highly significant in some physical laws and, in fact, indispensable if intuitive understanding (however efficient) of those laws is to crystallize in articulate formulations.

Differentiation  between  p,  on  the  one  hand,  and  the  numbers  p  y  or  the 
class  ran  p,  on  the  other,  however  slight  the  difference  may  appear,  is  at 
variance  with  the  entire  traditional  literature  on  fluents  inasmuch  as  the 
latter  is  at  all  articulate.  McKinsey,  Sugar,  and  Suppes  20  introduce  time 
as  a  class  of  numbers  (clock  readings)  and  Artin  21  takes  a  similar  position 
(whereas,  from  the  point  of  view  here  expounded,  t^,  for  an  observer  A, 
is  the  class  of  all  pairs  (r,  IAT)  for  any  act  T  of  clock  reading  performed  by 
A).  Courant  says  explicitly  22  that  Boyle's  law  deals  only  with  the  values 
ol  p  and  v  and  not  with  those  quantities  themselves.  All  that  physics 
supplies,  he  emphasizes,  are  the  classes  of  values  of  p  and  v. 

In  fact,  Courant  mentions  p  as  an  example  of  a  variable  (a  symbol  that 

20  Cf.  McKinsey,  Sugar  and  Suppes  [9]. 

21 Cf. Artin [1], p. 70.

22  Cf.  Courant  [3],  p.  16. 


AN  AXIOMATIC  THEORY  OF  FUNCTIONS  AND  FLUENTS        465 

may be replaced with the designation of any element of a class of numbers),
thereby illustrating another error pervading the traditional
literature: the identification of fluents with what herein is called number
variables, and the indiscriminate use of the term variable as well as the
same (italic) type for both.

Yet (and this is a mere hint of the actual gulf separating the two) number
variables may be interchanged, whereas fluents (e.g., abscissa and ordinate
along a curve in a Cartesian plane, x being the class of all pairs (π, x π) for any
point π on the curve) must not. For instance, the class of all (x, y) such that
y = x² is the same as the class of all (y, x) such that x = y², whereas the
parabola y = x² and the parabola x = y² are different curves.
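
The contrast between interchangeable number variables and non-interchangeable fluents can be checked on a finite grid. The following Python sketch is an illustration only: the grid, the representation of points as coordinate tuples, and the names abscissa and ordinate as functions on points are assumptions made here.

```python
# Number variables: renaming x and y leaves the class of number pairs unchanged.
grid = range(-3, 10)
class_xy = {(x, y) for x in grid for y in grid if y == x ** 2}
class_yx = {(y, x) for y in grid for x in grid if x == y ** 2}
same_class = class_xy == class_yx

# Fluents: abscissa and ordinate assign a number to each point of the plane.
plane = [(a, b) for a in grid for b in grid]


def abscissa(point):
    return point[0]


def ordinate(point):
    return point[1]


# Interchanging the two fluents in the equation yields a different curve.
parabola_1 = {pt for pt in plane if ordinate(pt) == abscissa(pt) ** 2}
parabola_2 = {pt for pt in plane if abscissa(pt) == ordinate(pt) ** 2}
different_curves = parabola_1 != parabola_2

print(same_class, different_curves)
```

The point (2, 4) lies on the first parabola only, and the point (4, 2) on the second only.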
The confusion is enhanced by the use of the term variable, thirdly, for symbols
that are replaceable with the designations of any element of some well-defined
class of fluents or of classes of consistent quantities; in other words, for fluent
variables or c.c.q. variables; e.g., for u in the statement
$\frac{d \sin u}{du} = \cos u$ for any
c.c.q. u that is continuous on (the limit class) dom u. Here, u may be replaced
with the designation of the time 23 or the abscissa or even a continuous
functional (as is $\int_0^1$ in the realm of continuous functions whose limit is defined
by uniform convergence), but not with the designation of a number. One has
$\frac{d \sin t}{dt} = \cos t$, $\frac{d \sin x}{dx} = \cos x$ and even
$\frac{d \sin \int_0^1}{d \int_0^1} = \cos \int_0^1$,
whereas $\frac{d \sin 1}{d1} = \cos 1$ is nonsense.

The literature also contains allusions to fluents that avoid confusing
them with either classes of numbers or number variables. But those
allusions (usually to "variable numbers") are inarticulate beyond recognition.
For instance, Russell 24, Tarski 25, and other logicians in discussing
number variables have repeatedly criticized the misconception
of numbers that are capable of various values; and indeed, there are no
numbers that are both 0 and 1, nor, as someone put it, numbers that have
different values on weekdays and on Sundays. What logicians seem to
overlook, however, is the fact that many obscure allusions to "variable
numbers" do not refer to number variables in the logico-mathematical
23 Strictly speaking, the domain of a fluent is not a limit class. In a model,
however, according to the concluding remarks of the present paper, t and s may be
assumed to be continuous classes of consistent quantities on domains that are limit
classes. Cf. Calculus, pp. 220-225.

24  Cf.  Russell  [20],  p.  90. 

25  Cf.  Tarski  [21],  pp.  3,  4. 


466  KARL   MENGER 

sense,  but  rather  represent  utterly  confused  references  to  Newton's 
fluents.  A  fluent  (without  of  course  being  a  variable  number)  may  indeed 
assume  both  the  value  1  and  the  value  0.  In  fact,  it  may  (as  does,  e.g.,  the 
admission  fee  in  $  to  certain  art  galleries)  assume  the  value  1  on  weekdays 
and  the  value  0  on  Sundays. 

In the broadest sense, a fluent may be defined as a class of consistent
quantities with an extramathematical domain, the c.'s c.q. with mathematical
domains being functions and functionals. Fluents such as the
class h of all pairs (F, hF) for any Frenchman F, where hF is F's height in
cm. (studied in biology and sociology), are sometimes called variates;
their domains, populations.

Clearly, not every quantity, as defined on p. 463, is interesting; nor is every
fluent significant, even if its elements are interesting quantities; think of the
union of the height in the population of France and the weight in the population
of Italy. Nor, for that matter, is every function and every functional important.
While the general theory, of course, provides the scheme for handling all
fluents, it is up to the individual investigator to apply it to some of the countless
cases that are theoretically or practically significant.

Some critics of the theory here expounded have suggested that its basic idea,
the concept of fluent, has always been known, viz., under the name of "real
function" and, moreover, follows the pattern of Kolmogoroff's well-established
concept of random variables (r.v.'s). Besides overextending the use of the
term function (see p. 463), those critics seem to overlook: (1) that what is
essential in the theory is the formulation of definitions for the (heretofore
only intuitively used) concepts that Newton called fluents, definitions that
are at variance with their traditional treatment, which ignores classes of pairs
altogether (see p. 464); (2) that scientific fluents and r.v.'s lack one another's
very characteristics and are, if anything, complementary rather than parallel
concepts. 26 If Δ is a physical die, then the (extramathematical) class t_Δ of all
pairs (d, t_Δ d) for any act d of rolling Δ is an experimental fluent but not a r.v., not
even if an additive functional ("probability") is defined for the 2^6 subsets of
ran t_Δ = {1, ..., 6} (i.e., the class S of all possible outcomes of rolling Δ). On
the other hand, in presence of such a probability functional on the subsets of
S, any (purely mathematical) function having S as domain is a r.v. but not a
scientific fluent; e.g., the function f for which f(1) = √7, f(2) = π + e,
f(6) = cos 2 + log 5. By their definitions, r.v.'s lack connections with
experiment and observation. Again, scientific fluents such as t_Δ, gas pressure,
and time lack the characteristic of r.v.'s, since the definition of a reasonable
probability on subsets of their domains is completely out of the question.
(What should be the probability of an act of rolling Δ, or of a gas sample or of
an act of clock reading? Only in the range of a scientific fluent can one define
frequency, relative frequency and, perhaps, probability.) (3) That even someone
referring to all functors as "functions" cannot escape the use of a special
term (say, "functions in the strict sense") referring to those functors whose
domains consist of numbers or sequences of numbers. For (because of their
substitutive properties, not shared by any other functors) these functors play a
special role, and therefore are omnipresent in science as well as in mathematics.
While in the light of the conceptual clarifications, terminological questions are
quite insignificant, it does seem most appropriate to call fluents what Newton
called fluents, and functions, what Leibniz called functions.

26 Cf. Menger [17], pp. 222-223.
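
The complementarity claimed in point (2) can be sketched in Python. Everything specific below is an assumption made for illustration: the four recorded rolls, the uniform probability on the outcomes, and the arbitrary function f on the outcomes. The sketch only shows where frequencies live (on the range of the fluent) and where a random variable lives (on the class of outcomes).

```python
from collections import Counter
from fractions import Fraction

# Experimental fluent: pairs (act of rolling, outcome). The rolls are invented.
t_die = {"roll-1": 3, "roll-2": 6, "roll-3": 3, "roll-4": 1}

# Relative frequency is defined on the range of the fluent, not on its domain.
outcome_counts = Counter(t_die.values())
relative_frequency = {
    outcome: Fraction(count, len(t_die))
    for outcome, count in outcome_counts.items()
}

# Random variable: a purely mathematical function on the outcomes 1..6,
# in the presence of an assumed (uniform) probability on those outcomes.
outcomes = range(1, 7)
probability = {outcome: Fraction(1, 6) for outcome in outcomes}
f = {outcome: outcome * outcome for outcome in outcomes}
expected_f = sum(f[o] * probability[o] for o in outcomes)

print(relative_frequency)
print(expected_f)
```

No probability is assigned to "roll-1" or any other act; the acts only index the observations.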

The union of non-identical fluents with the same domain is not a fluent.
From a set-theoretical point of view, fluents do not constitute a Boolean
algebra. One of the few positive formal properties of fluents is the possibility
of substituting them into 1-place functions: log p is the class of
all pairs (γ, log p γ) for any sample γ, a definition analogous to that of
log cos. But while also the cosine of the logarithm is a c.c.q. ≠ ∅, the
pressure of the logarithm is empty. Every function permits some non-empty
substitutions, whereas a fluent (like a functional) permits none.
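
A minimal Python sketch of these two substitutions follows. The sample labels, the pressure values, and the helper names substitute and fluent_of_function are invented for illustration; the fluent is again modelled as a dictionary.

```python
import math

# Gas pressure as a fluent on invented samples.
p = {"sample-1": 1.0, "sample-2": 2.0, "sample-4": 4.0}


def substitute(function, fluent):
    """Function of a fluent: the class of all pairs (γ, function(fluent γ))."""
    return {arg: function(value) for arg, value in fluent.items()}


def fluent_of_function(fluent, function, numbers):
    """Fluent of a function: keeps x only if function(x) lies in dom fluent."""
    return {x: fluent[function(x)] for x in numbers if function(x) in fluent}


log_p = substitute(math.log, p)  # non-empty
cos_of_log = {x: math.cos(math.log(x)) for x in [1.0, 2.0, 4.0]}  # non-empty
p_of_log = fluent_of_function(p, math.log, [1.0, 2.0, 4.0])  # empty

print(log_p)
print(p_of_log)
```

The values of log are numbers, and no number is a gas sample, so nothing survives in p_of_log.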

Attempts have been made to dodge the problem of articulately connecting
various fluents by defining some of them as functions of others 27. Yet, even
if in a gallery a sign declares that admission costs $ 1 on weekdays and is free
on Sundays, the concept of admission fee cannot very well be said to be defined
as a function of the time. Someone unfamiliar with that concept will not
grasp it by reading the sign while, on the other hand, the concept is comprehensible
to persons ignorant of the days of the week. Actually, admission fee
might (for operative purposes) be defined as the class a of all pairs (λ, a λ) for
any act λ of admitting a visitor, where a λ is the amount in $ charged during λ.
The sign, comprehensible only to those who know a and t, stipulates how the
two are connected.

By substitutions into 2-place functions S, P, etc., significant addition,
multiplication etc. of fluents can be defined, provided that their domains
are non-disjoint, the only condition for arithmetical operations on
c.'s c.q. to be non-empty (see p. 463). For instance, P(p, v), the result of
intersection substitution of p and v (whose common domain is the class
of all γ) is p·v. But a slight change in the point of view raises difficulties.
What, in view of the fact that dom p_A and dom v_A consist of acts of
different (manometric and volumetric) observations, is the meaning of
p_A·v_A? Since Boyle, it has become traditional to associate with that
symbol (if only intuitively, i.e., without explicit definitions) the class
of all pairs ((π, β), p_A π · v_A β) for any two simultaneous acts π and β that A directs to

27  Cf.  the  references  in  footnotes  20  and  21. 



the same object; thus p_A·v_A denotes the restriction of P[p_A, v_A] to the
class Γ of all pairs of simultaneous and co-objective acts ∈ dom p_A × dom v_A.
It thus appears that in operating on fluents, besides referring to the elements
of their several domains, one may well have to relativize the operations
to certain pairings of those domains. Such relativizations are
imperative in formulating (articulately formulating) relations between
fluents.
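
The restriction of a product to a pairing of the domains can be sketched as follows. The act labels, the readings, and the pairing are invented for illustration, and the helper name product is an assumption of this sketch rather than notation from the text.

```python
# Observed fluents with disjoint domains (manometric and volumetric acts).
p_A = {"manometer-1": 2.0, "manometer-2": 4.0}
v_A = {"volume-1": 0.5, "volume-2": 0.25}

# Pairing of simultaneous, co-objective acts; a subset of dom p_A x dom v_A.
pairing = {("manometer-1", "volume-1"), ("manometer-2", "volume-2")}


def product(u, w):
    """Product over all of dom u x dom w: pairs ((a, b), u[a] * w[b])."""
    return {(a, b): u[a] * w[b] for a in u for b in w}


unrestricted = product(p_A, v_A)
restricted = {acts: value for acts, value in unrestricted.items() if acts in pairing}

print(len(unrestricted))
print(restricted)
```

Only the restricted class corresponds to what the symbol p_A·v_A is traditionally taken to mean; the unrestricted product also multiplies readings taken on different occasions.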

III.  RELATIVE  CONNECTIONS  OF  FLUENTS  BY  FUNCTIONS 

Consider Boyle's law for gas undergoing an isothermal process (in
proper units, v = 1/p). If all that physics supplied were the values of v and
p or the classes of those values, then Boyle might have discovered his
law  upon  being  presented  with  a  bag  containing  cards  each  indicating  a 
value  of  p,  and  another  bag  informing  him  of  the  values  of  v.  But  why,  in 
that  situation,  should  Boyle  have  paired  each  number  in  the  first  bag  just 
with  its  reciprocal  in  the  second  rather  than,  say,  with  its  square  root? 
As  a  matter  of  fact,  Boyle  did  not  primarily  pair  numbers  at  all.  Pairing 
numbers  is  what  mathematicians  do  in  defining  functions.  What  Boyle 
actually  paired  were  observations  pertaining  to  the  same  object;  and  he 
discovered  that 

(3) v γ = 1/(p γ) for any inst. gas sample γ at the fixed temperature.

This statement is comparable to

(4) cot x = 1/(tan x) for any number x that is not a multiple of π/2,

with v and p corresponding to cot and tan; and the sample variable γ, to
the number variable x.
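
The point of the two-bags argument can be illustrated in Python. The samples and values are invented, and the second volume fluent v_shuffled is a hypothetical counterexample constructed here: it has exactly the same class of values as v but is paired differently with the samples.

```python
# Pressure and volume as fluents on the same invented gas samples.
p = {"sample-1": 1.0, "sample-2": 2.0, "sample-3": 4.0}
v = {"sample-1": 1.0, "sample-2": 0.5, "sample-3": 0.25}

# Statement (3): v γ = 1/(p γ) for every sample γ.
law_holds = all(v[sample] == 1 / p[sample] for sample in p)

# A hypothetical fluent with the same "bag" of values, attached to other samples.
v_shuffled = {"sample-1": 0.5, "sample-2": 0.25, "sample-3": 1.0}
same_bag = sorted(v.values()) == sorted(v_shuffled.values())
law_holds_shuffled = all(v_shuffled[s] == 1 / p[s] for s in p)

print(law_holds, same_bag, law_holds_shuffled)
```

The classes of values are identical in both cases; only the pairing by sample distinguishes the fluent that obeys the law from the one that does not.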

Unfortunately,  the  classical  literature  has  done  all  that  was  possible  to  conceal 
the  existing  analogies.  Besides,  it  has  simulated  a  parallelism  between  v,  p  and 
x  by  indiscriminately  referring  to  them  as  "variables"  and  using  the  same 
(italic)  type  for  all  of  them  (whereas  the  functions  are  usually  denoted  by  cot 
and  tan).  In  an  attempt  to  mask  the  confusion  between  fluents  and  number 
variables,  a  contradiction  in  terms  comparable  to  "enslaved  freeman"  was 
coined: "dependent variable". Finally, the true analogues of v = 1/p, formulae
such as cot = 1/tan (connecting two functions just as Boyle's law connects two



fluents)  are  anathema,  and  only  the  corresponding  statements  about  numbers, 
such  as  (4)  are  admitted.  28 

For an observer A, Boyle's law takes the form
(5) v_A β = 1/(p_A π) for any two acts (π, β) ∈ Γ.


Relativizing connections of two fluents to a class Γ of pairs of simultaneous
co-objective acts is very natural though not logically cogent. At any
rate, since Galileo and Boyle, such (tacitly understood) relativizations
have become second nature to physicists, who have transplanted them,
as matters of course, even to quantum mechanics, a field where they
are rather problematic. In v = 1/p, the pairing is altogether hidden.

On the level of general statements about fluents, however, the need for
explicit relativizations is evident. The question "Is w = 1/u?" for any
two fluents is incomplete. Certainly it does not necessarily refer to the
entire class dom u × dom w; that is to say, it does not necessarily mean
"Is each value of w the reciprocal of each value of u?" In this sense, for an
affirmative answer it would be necessary that both u and w were constant
fluents. The question thus must refer to some subset of dom u × dom w.
But to which subset? No particular subset of the Cartesian product of any
two  (especially  disjoint)  sets  is  or  can  be  "natural".  The  intended  subset 
must  be  specified.  Such  a  relativization  is  necessary  in  order  to  make  the 
question  complete. 

In the broadest sense, the connection of a class of consistent quantities
w with another c.c.q. u relative to a set Π ⊆ dom u × dom w by the function
f is described in the following basic definition:

w = f u (rel. Π) if and only if (α, β) ∈ Π implies w β = f(u α).

Here, the consequent might be replaced with: (u α, w β) ∈ f. For instance,
(3) results if Π is the class I of all pairs (γ, γ). The connection of functions
by functions in traditional analysis is relative to restrictions of j. If j' is
the restriction of j to numbers that are not multiples of π/2, then (4)
is subsumed under the general scheme:
cot = j^{-1} tan (rel. j') since cot y = j^{-1} tan x for any (x, y) ∈ j'.

28 It is not unusual to write, e.g.: if f = 1/g, then g = 1/f for any two functions f
and g (thus dispensing with number variables). But, in violation of automatic
substitutive procedures, the function variables f and g are replaced, e.g., by tan x
and cot x, and not in the traditional literature by tan and cot.



Clearly, w = f u (rel. Π) implies u = Inv f w (rel. conv. Π); and
w = f v (rel. Π) and v = g u (rel. P) imply w = f g u (rel. ΠP).
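
The basic definition and the two consequences just stated can be checked on finite data. The following Python sketch makes several assumptions of its own: fluents are dictionaries, pairings are sets of pairs, the helper names connected, converse, and compose are invented, and the composed pairing is formed as an ordinary relative product (the text's notation for it is not spelled out).

```python
def connected(w, f, u, pairing):
    """w = f u (rel. pairing): w[b] == f(u[a]) for every (a, b) in the pairing."""
    return all(w[b] == f(u[a]) for (a, b) in pairing)


def converse(pairing):
    """Converse pairing: every pair reversed."""
    return {(b, a) for (a, b) in pairing}


def compose(first, second):
    """Relative product: (a, c) whenever (a, b) in first and (b, c) in second."""
    return {(a, c) for (a, b) in first for (b2, c) in second if b == b2}


# Three invented fluents on pairwise disjoint domains.
u = {"a1": 2.0, "a2": 4.0}
v = {"b1": 0.5, "b2": 0.25}
w = {"c1": 1.5, "c2": 1.25}

reciprocal = lambda x: 1 / x
add_one = lambda x: x + 1

pair_uv = {("a1", "b1"), ("a2", "b2")}
pair_vw = {("b1", "c1"), ("b2", "c2")}
full_uv = {(a, b) for a in u for b in v}

v_from_u = connected(v, reciprocal, u, pair_uv)
u_from_v = connected(u, reciprocal, v, converse(pair_uv))
w_from_v = connected(w, add_one, v, pair_vw)
w_from_u = connected(w, lambda x: add_one(reciprocal(x)), u, compose(pair_uv, pair_vw))
over_full_product = connected(v, reciprocal, u, full_uv)

print(v_from_u, u_from_v, w_from_v, w_from_u, over_full_product)
```

The last check fails because u and v are not constant fluents, which is the reason a pairing has to be specified before the question "Is v = 1/u?" is complete.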

It is now clear why functions have been defined as on p. 454, and "multivalued"
functions have been strictly excluded. If the latter were admitted,
then, relative to every pairing, every fluent would be a function of every other
fluent. The question "Is w a function of u rel. Π?", which is so important in
science (e.g., in thermodynamics), would be deprived of any meaning. However,
for any 2-place function F, one may define:

(6) F(u, w) = 0 (rel. Π) if and only if (α, β) ∈ Π implies F(u α, w β) = 0.

Of course only if F(u α, w β) ≠ 0 for some (α, β) ∈ dom u × dom w (especially, if
F ≠ 0) does (6) establish a connection between u and w.
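
Definition (6) and the closing caveat can be illustrated in the same finite setting. The fluents, the pairing, and the two sample functions below are invented; the helper name zero_relation is an assumption of this sketch.

```python
def zero_relation(F, u, w, pairing):
    """F(u, w) = 0 (rel. pairing): F(u[a], w[b]) == 0 for every paired (a, b)."""
    return all(F(u[a], w[b]) == 0 for (a, b) in pairing)


u = {"a1": 2.0, "a2": 4.0}
w = {"b1": 0.5, "b2": 0.25}
pairing = {("a1", "b1"), ("a2", "b2")}
full = {(a, b) for a in u for b in w}

product_minus_one = lambda x, y: x * y - 1
identically_zero = lambda x, y: 0

holds_on_pairing = zero_relation(product_minus_one, u, w, pairing)
holds_on_full = zero_relation(product_minus_one, u, w, full)
trivial_on_full = zero_relation(identically_zero, u, w, full)

print(holds_on_pairing, holds_on_full, trivial_on_full)
```

The identically vanishing function satisfies (6) relative to every pairing and therefore connects nothing, whereas the first function holds on the chosen pairing and fails on the full product.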

The most general connection of a functor w with n functors v_1, ..., v_n
relative to P ⊆ dom v_1 × ... × dom v_n × dom w by the n-place function
G is given by:

w = G[v_1, ..., v_n] (rel. P) if and only if
(β_1, ..., β_n, γ) ∈ P implies w γ = G(v_1 β_1, ..., v_n β_n).

The chain rule reads as follows:

w = G[v_1, ..., v_n] (rel. P) and v_i = F_i[u_{i,1}, ..., u_{i,m_i}] (rel. Π_i) imply
w = G[F_1, ..., F_n][u_{1,1}, ..., u_{n,m_n}] (rel. P[Π_1, ..., Π_n]).


The rate of change of w with, say, v_n rel. P (keeping v_1, ..., v_{n-1} unchanged)
is a fluent with the domain P, which must not be confused with
the n-th place partial derivative D_n G, which is an n-place function with a
domain ⊆ dom G. While the two symbols are frequently misrepresented
as synonyms, the concepts are connected 29 by the formula:

$\left(\frac{\partial w}{\partial v_n}\right)_{v_1, \ldots, v_{n-1}} = D_n G(v_1, \ldots, v_n)$ (rel. P).

But it is important to note that the rate of change of w with v_n rel. P may
well exist without w being a function of v_1, ..., v_n rel. P. An analogous
distinction is necessary between the cumulation of w with v_n and the n-th
place partial integral of G.

From  the  preceding  exposition  of  the  material,  based  on  explicit 
definitions,  there  emerge  the  outlines  of  its  axiomatic  treatment.  A  group 

29 Cf. Calculus, Chapter XI, especially pp. 306-315 and 332-341 and Menger [16].



I of postulates has to be devoted to partial order in a realm of undefined
entities (called n-ary relations) in which there are two operations, intersection
and Cartesian multiplication, subject to postulates of group II.
In terms of these operations, associative substitutions are introduced
(group III). Union of relations plays a small role, if any, and certainly
none in that important subclass of relations whose elements are called
classes of consistent pairs (group IV), because in the realm of c.'s c.p.
union cannot in general be defined. Of particular significance among
c.'s c.p. are selector and identity relations (group V) which, as has been
illustrated in the realm of the 1-place functions, play the roles of domains
of c.'s c.p. At this point, the class of all real numbers (or, if one pleases,
a field or ring) enters the picture. By means of it, classes of consistent
quantities or functors can be singled out (group VI) and, among them,
functions constituting a hypergroupoid. Selector relations that are functions
are the all-important selector functions, including the identity
function j. What precedes is a basis for treating the connection of one
functor with n other functors by means of an n-place function relative to
an (n + 1)-ary relation between their domains, as well as a functional
interrelation of m functors relative to an m-ary relation.

Clearly, such an axiomatic theory represents the most general treatment
of models in the sense in which this term is used in science, especially,
in social sciences. An analogy appears with postulational geometry, which
deals with undefined elements, called points and lines for the sake of a
suggestive terminology, while all that is assumed about them is that they
satisfy certain assumptions. Subsequently, they are compared with observable
objects, e.g., in the astronomical space, with cross hairs and light
rays. Models are formulated in terms of functor variables, that is, undefined
classes of consistent quantities, called, say, time and position or pressure
and volume and denoted by t and s or p and v, for the sake of a suggestive
terminology, while all that is assumed about them is that, relative to
undefined pairings of their domains, those functors are interrelated by
certain functions. Subsequently, an observer A compares them with
observed fluents (t_A and s_A or p_A and v_A) relative to specified pairings of
the domains of the latter. He trusts that, within certain limits of accuracy,
the statements concerning the undefined functors in the model will be
verified by known connections between the observed fluents; some of
them, he hopes, by previously unknown connections 30.

30  The  ideas  here  outlined  seem  to  supplement  the  existing  theory  on  concept 
formation  in  empirical  science;  cf.  Carnap  [2]  and  Hempel  [5]. 



As  far  as  the  general  theory  of  fluents  is  concerned,  the  prediction  may 
be  ventured  that  indiscriminate  uses  of  the  term  "variable"  and  of 
nondescript  letters  x  will  give  way  to  more  careful  distinctions;  and  that 
references  to  domains  of  fluents  as  well  as  to  pairings  of  those  domains, 
once  introduced,  will  be  permanently  incorporated  in  the  articulate 
formulations  of  scientific  laws. 

Acknowledgements 

The  author  is  grateful  to  Professors  M.  A.  McKiernan,  B.  Schweizer,  and  A. 
Sklar  for  valuable  suggestions  in  connection  with  this  paper,  and  to  the  Carnegie 
Corporation  of  New  York  for  making  it  possible  to  devote  time  to  the  development 
of  the  material. 


Bibliography 

[1] ARTIN, E., Calculus and Analytic Geometry. Charlottesville 1957, 126 pp.
[2] CARNAP, R., The methodological character of theoretical concepts. In Minnesota
    Studies in Philosophy of Science, vol. 1, Minneapolis 1956.
[3] COURANT, R., Differential and Integral Calculus, vol. 1,
[4] HELLER, I., On generalized polynomials. Reports of a Mathematical Colloquium,
    2nd ser., issue 8 (1947), pp. 58-60.
[5] HEMPEL, C. G., Fundamentals of concept formation in the empirical sciences.
    International Encyclopedia of Unified Science, vol. 2 no. 7, Chicago 1952.
[6] MCKIERNAN, M. A., Les séries d'itérateurs et leurs applications aux équations
    fonctionnelles. Comptes Rendus Paris, vol. 246 (1958), pp. 2331-2334.
[7] MCKIERNAN, M. A., Le prolongement analytique des séries d'itérateurs. Comptes Rendus
    Paris, vol. 246 (1958), pp. 2564-2567.
[8] MCKINSEY, J. C. C., Postulates for the calculus of binary relations. Journal of
    Symbolic Logic, vol. 5 (1940), pp. 85-97.
[9] MCKINSEY, J. C. C., SUGAR, A. C. and P. SUPPES, Axiomatic foundations of classical
    particle mechanics. Journal of Rational Mechanics and Analysis, vol. 2 (1953),
    pp. 253-272.
[10] MENGER, K., Calculus. A Modern Approach. Boston 1955, XVIII + 354 pp.
[11] MENGER, K., The ideas of variable and function. Proceedings of the National Academy,
    U.S.A., vol. 39 (1953), pp. 956-961.
[12] MENGER, K., New approach to teaching intermediate mathematics. Science, vol. 127
    (1958), pp. 1320-1323.
[13] MENGER, K., Algebra of Analysis. Notre Dame Mathematical Lectures, vol. 3, 1944,
    50 pp.
[14] MENGER, K., Tri-operational algebra. Reports of a Mathematical Colloquium, 2nd
    series, issue 5-6 (1945), pp. 3-10 and issue 7 (1946), pp. 46-60.
[15] MENGER, K., Calculus. A Modern Approach. Mimeographed Edition, Chicago 1952,
    XXV + 255 pp.
[16] MENGER, K., Rates of change and derivatives. Fundamenta Mathematicae, vol.
    46 (1958), pp. 89-102.
[17] MENGER, K., Random variables and the general theory of variables. Proceedings of the
    3rd Berkeley Symposium on Mathematical Statistics and Probability, vol. 2,
    Berkeley 1956, pp. 215-229.
[18] MILGRAM, A. N., Saturated polynomials. Reports of a Mathematical Colloquium,
    2nd series, issue 7 (1946), pp. 65-67.
[19] NÖBAUER, W., Über die Operation des Einsetzens in Polynomringen. Mathematische
    Annalen, vol. 134 (1958), pp. 248-259.
[20] RUSSELL, B., The Principles of Mathematics. Vol. 1. Cambridge 1903, XXIX
    + 534 pp.
[21] TARSKI, A., Introduction to Logic. New York 1941, XVIII + 239 pp.
[22] TARSKI, A., On the calculus of relations. Journal of Symbolic Logic, vol. 6 (1941),
    pp. 73-89.


Symposium  on  the  Axiomatic  Method 


AXIOMATICS  AND  THE  DEVELOPMENT  OF  CREATIVE  TALENT 

R.   L.  WILDER 

University  of  Michigan,  Ann  Arbor,  Michigan,  U.S.A. 

Introduction.  Perhaps  I  should  apologize  for  presenting  here  a  paper 
that  embodies  no  new  results  of  research  in  axiomatics.  However,  for  some 
time  I  have  felt  that  someone  should  record  a  description  of  an  important 
method  of  teaching  based  on  the  axiomatic  method,  and  this  conference 
seems  an  appropriate  place  for  it. 

Actually, I can point to an excellent precedent in that the late E. H.
Moore devoted most of his retiring address [2], as president of the American
Mathematical Society, to a study of the role of the then rapidly
developing abstract character of pure mathematics, especially the increasing
use of axiomatics, in the teaching of mathematics in the primary
and secondary schools. Just how much influence E. H. Moore's ideas had
on the later developments in elementary mathematical education in this
country, I do not know. It is perhaps significant that the increasing
concern with these matters on the part of a large section of the membership
of the American Mathematical Society (particularly in the Chicago
Section) led, several years later, to the forming of a new organization,
the Mathematical Association of America, whose special concern was with
the teaching of mathematics in the undergraduate colleges. 1

Historical  Development  of  the  Method.  We  have  heard  a  great  deal,  the 
past  fifty  years  or  so,  of  the  use  of  the  axiomatic  method  as  a  tool  for 
research.  Indeed,  this  use  of  the  method  has  been  justly  considered  as  one 
of  the  most  outstanding  and  surprising  phenomena  in  the  evolution  of 
modern mathematics. Scarcely a half century ago, so great a mathematician
as Poincaré could devote, in an article entitled The Future
of  Mathematics  [6],  less  than  half  a  page  to  the  axiomatic  method.  And 
although  conceding  the  brilliance  of  Hilbert's  use  of  the  method,  he 
predicted  that  the  problem  of  providing  axiomatic  foundations  for 
various  fields  of  mathematics  would  be  very  "restricted",  and  that  "there 
would  be  nothing  more  to  do  when  the  inventory  should  be  ended,  which 

1 See [1], parts VII and XV 6, but especially p. 81 and p. 146.


could  not  take  long.  But  when",  he  continued,  "we  shall  have  enumerated 
all,  there  will  be  many  ways  of  classifying  all;  a  good  librarian  always 
finds  something  to  do,  and  each  new  classification  will  be  instructive  for 
the  philosopher." 

As  recently  as  1931,  Hermann  Weyl  matched  the  contempt  veiled  in 
these  remarks  by  a  fear  expressed  as  follows:  "—I  should  not  pass  over  in 
silence  the  fact  that  today  the  feeling  among  mathematicians  is  beginning 
to  spread  that  the  fertility  of  these  abstracting  methods  [as  embodied  in 
axiomatics]  is  approaching  exhaustion.  The  case  is  this:  that  all  these 
nice  general  notions  do  not  fall  into  our  laps  by  themselves.  But  definite 
concrete  problems  were  conquered  in  their  undivided  complexity,  single- 
handed  by  brute  force,  so  to  speak.  Only  afterwards  the  axiomaticians 
came  along  and  stated:  Instead  of  breaking  in  the  door  with  all  your 
might  and  bruising  your  hands,  you  should  have  constructed  such  and 
such  a  key  of  skill,  and  by  it  you  would  have  been  able  to  open  the  door 
quite  smoothly.  But  they  can  construct  the  key  only  because  they  are 
able,  after  the  breaking  in  was  successful,  to  study  the  lock  from  within 
and  without.  Before  you  can  generalize,  formalize  and  axiomatize,  there 
must  be  a  mathematical  substance.  I  think  that  the  mathematical 
substance  in  the  formalizing  of  which  we  have  trained  ourselves  during 
the  last  decades,  becomes  gradually  exhausted.  And  so  I  foresee  that  the 
generation  now  rising  will  have  a  hard  time  in  mathematics."  2 

Evidently  mathematical  genius  does  not  correlate  well  with  the  gift  of 
prophecy, since neither Poincaré's disdain nor Weyl's fears have been
justified. Neither of these eminent gentlemen seems to have realized that
a powerful creative tool was being developed in the new uses of the
axiomatic method. It was Weyl's good fortune to live to see and acknowledge
the triumphs of the method. And undoubtedly had Poincaré
lived to observe how the method contributed to the progress of mathematics,
he would gladly have admitted his prophetical shortcomings. It is
easy  to  comprehend  why  they  felt  as  they  did,  and  as,  conceivably,  a 
majority  of  their  colleagues  felt.  For  until  quite  recent  years,  the  method 
had  achieved  its  most  notable  successes  in  geometry,  where  axiom 
systems  often  served  as  suitable  embalming  devices  in  which  to  wrap  up 
theories  already  worked  out  and  in  a  stage  of  decline.  The  value  of  the 
method  as  a  tool  for  opening  up  vast  new  domains  for  mathematical 

2  Quoted  from  H.  Weyl  [7].  It  is  to  Weyl's  credit  that  he  acknowledges,  in  this 
connection,  the  brilliant  results  obtained  by  Emmy  Noether  by  her  pioneering  use 
of  the  axiomatic  method  in  algebra. 



investigation,  as  it  has  done  in  algebra  and  topology  for  example,  was  not 
yet  sufficiently  exemplified  to  make  an  impression  on  the  mathematical 
public. Peano's fundamental researches in logic and number theory were
concealed in his unique "pasigraphy"; and besides, was not this again a
case of wrapping old facts in new dress (mused the uncomprehending
analyst)? Similarly Grassmann's earlier work in his (now justly appreciated)
Ausdehnungslehre was concealed in a mass of philosophical
obscurities, and moreover the philosophy of the time was dominated by a
Kantian  intuitionism  not  receptive  to  the  idea  of  mathematics  as  a 
science  of  formal  structures. 

Nevertheless,  the  evolution  of  modern  mathematics  was  proceeding  in 
a  direction  which  made  inevitable  those  uses  of  axiomatics  with  which 
every modern mathematician is now familiar. No one, among the mathematicians
active around the turn of the century, appears to have been
more  aware  of  this  trend  than  the  American  mathematician  E.  H.  Moore. 
Moore's  interest  in,  and  use  of,  axiomatic  procedures  is  well  known,  and  I 
have  already  remarked  on  his  interest  in  the  influence  which  they  might 
have  on  the  teaching  of  elementary  mathematics.  Of  importance  for  my 
purposes  is  the  influence  of  his  ideas  on  a  group  of  young  mathematicians 
who  were  under  his  tutelage  at  the  time,  particularly  R.  L.  Moore  and 
O.  Veblen.  Both  Veblen  and  R.  L.  Moore  wrote  their  doctoral  dissertations 
in  the  axiomatic  foundations  of  geometry.  And  the  interests  of  both  soon 
turned  to  what  was  at  the  time  a  new  branch  of  geometry  in  which  metric 
ideas  play  no  official  role,  viz.  topology,  or  as  it  was  then  called,  analysis 
situs. 

It  is  an  interesting  fact,  however,  that  the  topological  interests  of  the 
two diverged, the one, Veblen, following the line initiated by Poincaré
and subsequently called "combinatorial topology", the other, R. L.
Moore, following the line stemming from the work of Cantor and Schoenflies
and subsequently called "set-theoretic topology". And whereas the
latter, set-theoretic topology, lent itself naturally to the axiomatic
approach which Moore continued to develop, the former, combinatorial
topology, was not left by Poincaré (whose feelings toward the axiomatic
method  we  have  already  indicated  above)  in  a  form  suitable  to  axiomatic 
development. 

The first major work of R. L. Moore in "analysis situs" [3] was published
in 1916.³ It embodied a set of axioms characterizing the analysis

3  There  are  three  axiom  systems  given  in  this  paper.  In  our  remarks  we  refer 
only to that one which is designated in [3] by the symbol "Σ₁".


situs  of  the  euclidean  plane.  In  a  later  paper  [4],  Moore  showed  this 
axiom  system  to  be  categorical,  and  still  later  [5]  applied  it  in  a  way 
prophetic  of  the  new,  creative  uses  of  the  axiomatic  method  soon  to 
come  into  vogue. 

However,  of  much  more  importance  for  my  present  purposes,  was  the 
manner in which Moore⁴ used his axiom system for plane analysis situs
for  discovering  and  developing  creative  talent.  Those  of  us  who  are 
accustomed  to  the  use  of  axioms  in  constructing  new  theories,  or  for  other 
technical  creative  purposes,  may  have  lost  sight  of  the  fact  that  the 
axiomatic  method  can  serve  as  the  basis  for  a  most  useful  teaching 
device. 

I  am  not  referring  to  the  traditional  use  of  axioms  in  teaching  high 
school  geometry  of  the  euclidean  type.  Although  here,  in  the  hands  of  an 
inspired  teacher,  the  method  can  and  sometimes  undoubtedly  does  turn 
up potential mathematicians, most of the teaching of high school geometry
seems  to  be  of  two  kinds.  Either  it  is  based  on  the  use  of  a  standard  text 
book  in  which  the  theorems  are  all  worked  out  in  detail  for  study  by  the 
pupil,  with  a  supply  of  minor  problems  —  so-called  "originals"  —  to  be 
done  by  the  pupil  and  geared  usually  to  the  ability  of  the  "average" 
student; or it is carried out in connection with a laboratory process which
is supposed to exemplify the so-called "reality" of the theorems proved,
thereby preventing the abstract character of the system from becoming
too  dominant.  In  short,  the  whole  process  may  be  considered  overly 
adapted  to  the  capacities  of  the  "average"  student  and  consequently 
generally  loses  —  perhaps  justifiably  —  its  potentiality  for  developing 
the  mathematical  talents  of  the  more  gifted  student. 

Nor  am  I  referring  to  the  fact  that  quite  commonly,  in  our  graduate 
courses  in  algebra  and  topology,  we  use  the  axiomatic  method  for  setting 
up  abstract  systems.  I  mean  something  more  than  this.  What  I  mean  can 
perhaps  be  indicated  by  a  remark  which  one  of  my  former  students  made 
to  me  in  a  recent  letter:  "I  am  having  quite  good  success  teaching  a 
course,  called  Foundations  of  Analysis,  by  the  Moore-Socrates  method." 
The  use  by  Moore  of  the  axioms  for  plane  analysis  situs  in  his  teaching  had 
many  elements  in  common  with  the  Socratic  method  as  revealed  in  the 
"Dialogues",  especially  in  the  general  type  of  interplay  between  master 
and  pupil. 

Moore proceeded thusly: He set up a course which he called "Foundations
of Mathematics", and admitted to attendance in the course only

4  From  here  on,  by  "Moore"  I  shall  mean  R.  L.  Moore. 


such  students  as  he  considered  mature  enough  and  sufficiently  sympathetic 
with  the  aims  of  the  course  to  profit  thereby.  It  was  not,  then,  a  required 
course,  nor  was  it  open  to  any  and  all  students  who  wanted  to  "learn 
something  about"  Moore's  work.  He  based  his  selection  of  students, 
from  those  applying  for  admission,  on  either  previous  contacts  (usually 
in  prior  courses)  or  (in  the  case  of  students  newly  arrived  on  the  campus) 
on  analysis  via  personal  interview  —  usually  the  former  (that  is,  previous 
contacts).  The  amazing  success  of  the  course  was  no  doubt  in  some 
measure  due  to  this  selection  process. 

He  started  the  course  with  an  informal  lecture  in  which  he  supplied 
some  explanation  of  the  role  to  be  played  by  the  undefined  terms  and 
axioms.  But  he  gave  very  little  intuitive  material  —  in  fact  only  meagre 
indication  of  what  "point"  and  "region"  (the  undefined  terms)  might 
refer  to  in  the  possible  interpretations  of  the  axioms.  He  might  take  a 
piece  of  paper,  tear  off  a  small  section,  and  remark  "Maybe  that's  a 
region".  However,  as  the  course  progressed,  more  intuitive  material  was 
introduced,  oftentimes  by  means  of  figures  or  designs  set  up  by  the 
students  themselves. 

The axioms were eight⁵ in number, but of these he gave only two or
three to start with; enough to prove the first few theorems. The remaining
axioms would be introduced as their need became evident. He
also  stated,  without  proof,  the  first  few  theorems,  and  asked  the  class  to 
prepare  proofs  of  them  for  the  next  session. 

In  the  second  meeting  of  the  class  the  fun  usually  began.  A  proof  of 
Theorem  1  would  be  called  for  by  asking  for  volunteers.  If  a  valid  proof 
was  given,  another  proof  different  from  the  first  might  be  offered.  In  any 
case,  the  chances  were  favorable  that  in  the  course  of  demonstrating  one 
of  the  theorems  that  had  been  assigned,  someone  would  use  faulty  logic 
or  appeal  to  a  hastily  built-up  intuition  that  was  not  substantiated  by  the 
axioms. 

I  shall  not  bore  you  with  all  the  details;  you  can  use  your  imaginations, 
if  you  will,  regarding  the  subsequent  course  of  events.  Suffice  it  to  say 
that  the  course  continued  to  run  in  this  way,  with  Moore  supplying 
theorems  (and  further  axioms  as  needed)  and  the  class  supplying  proofs. 
I  could  give  you  many  interesting  —  and  amusing  —  accounts  of  the 
byplay  between  teacher  and  students,  as  well  as  between  the  students 
themselves;  good-natured  "heckling"  was  encouraged.  However,  the 
point  to  be  emphasized  is  that  Moore  put  the  students  entirely  on  their  own 

5  One  of  these  was  later  shown  (by  the  present  author  [8])  not  to  be  independent. 


resources  so  far  as  supplying  proofs  was  concerned.  Moreover,  there  was  no 
attempt  to  cater  to  the  capacities  of  the  "average"  student;  rather  was 
the  pace  set  by  the  most  talented  in  the  class. 

Now  I  grant  that  there  seems  to  be  nothing  sensational  about  this. 
Surely others have independently initiated some such scheme of teaching.⁶
The  noteworthy  fact  about  Moore's  work  is  that  he  began  finding  the 
capacity for mathematical creativeness where no one suspected it existed!
In short, he found and developed creative talent. I think there is no
question  but  that  this  was  in  large  measure  due  to  the  fact  that  the 
student  felt  that  he  was  being  "let  in"  on  the  management  and  handling  of 
the  material.  He  was  afforded  a  chance  to  experience  the  thrill  of  creating 
mathematical  concepts  and  to  glimpse  the  inherent  beauty  of  mathematics, 
without  having  any  of  the  rigor  omitted  in  order  to  ease  the  process.  And 
in  their  turn,  when  they  went  forth  to  become  teachers,  these  students 
later  used  a  similar  scheme.  True,  they  met  with  varying  success  —  after 
all,  a  pedagogical  system,  no  matter  how  well  conceived,  must  be  operated 
by  a  good  teacher.  Their  success  was  striking  enough,  however,  that  one 
began  to  hear  comments  and  queries  about  the  "Moore  method".  And  it 
is  partly  in  response  to  these  that  I  am  talking  about  the  subject  today.  It 
seems  that  it  is  time  someone  described  the  method  as  it  really  operated, 
and  perhaps  thereby  cleared  up  some  of  the  folklore  concerning  it. 

Description  of  the  Method.  In  the  interest  of  clarity,  I  shall  arrange  my 
remarks  with  reference  to  certain  items  which  I  think,  after  analyzing 
the  method,  are  in  considerable  measure  basic  to  its  success.  These  items 
are  as  follows: 

1.  Selection  of  students  capable  (as  much  as  one  can  tell  from  personal 
contacts  or  history)  of  coping  with  the  type  of  material  to  be  studied. 

2. Control of the size of the group participating; from four to eight students
probably  the  best  number. 

3.  Injection  of  the  proper  amount  of  intuitive  material,  as  an  aid  in  the 
construction  of  proofs. 

4.  Insistence  on  rigorous  proof,  by  the  students  themselves,  in  accordance 
with  the  ideal  type  of  axiomatic  development. 

5. Encouragement of a good-natured competition; it can happen that as
many  different  proofs  of  a  theorem  will  be  given  as  there  are  students  in 
the  class. 

6  Professor  A.  Tarski  informed  me  after  the  reading  of  this  paper  that  he  had  used 
a  somewhat  analogous  method  in  one  of  his  courses  at  the  University  of  Warsaw. 


6.  Emphasis  on  method,  not  on  subject  matter.  The  amount  of  subject 
matter covered varies with the size of class and the quality of the individual
students.

I  think  these  six  items  lie  at  the  heart  of  the  method.  Of  course  they 
slight the details; e.g., the manner in which Moore exploited the competition
between students, and the way in which he would encourage a
student  who  seemed  to  have  the  germ  of  an  idea,  or  put  to  silence  one  who 
loudly  proclaimed  the  possession  of  an  idea  which  upon  examination 
proved  vacuous.  I  imagine  that  it  was  in  such  things  as  these  that  Moore 
most  resembled  Socrates.  But  these  are  matters  closely  related  to  Moore's 
personality  and  capability  as  a  teacher,  so  I  shall  confine  myself  to  the 
six  points  enumerated  above  so  far  as  the  description  of  the  method  is 
concerned.  They  are,  I  realize,  themselves  pedagogic  in  nature,  but  more 
of  the  nature  of  what  I  might  call  axiomatic  pedagogy.  They  constitute,  I 
believe,  a  guide  to  the  successful  use  of  axiomatics  in  the  development  of 
creative  talent. 

I would like to comment further on them:

1. Selection of students capable of coping with the type of material to be
studied. I have already made some remarks in this regard. I pointed out
that Moore based his judgment regarding maturity either on his experience
with the student in prior courses, or on personal interviews. I
might  add,  parenthetically,  that  as  the  years  went  by  and  his  students 
began  to  use  his  methods  in  their  own  teaching,  a  sort  of  code  developed 
between  them  whereby  one  of  the  "cognoscenti"  would  apprise  one  of  his 
colleagues  in  another  university  of  the  availability  of  potentially  creative 
material.  For  example,  the  "pons  asinorum"  of  Moore's  original  axiom 
system  was  "Theorem  15".  If  one  of  Moore's  graduates  wished  to  place  a 
student  for  further  work  under  the  tutelage  of  another  of  Moore's  students 
at  a  different  institution,  and  could  include  in  his  recommendation  the 
statement,  "He  proved  Theorem  15",  then  this  became  a  virtual  "open 
sesame". 

But Moore, himself, was not dependent on other institutions; he found
his students, generally speaking, in the student body of the University
of Texas. He had a singular ability for detecting talent among undergraduates,
and often set his sights on a man long before he was ready for
graduate  work.  Indeed,  in  some  instances,  he  would  allow  in  his  class  in 
"Foundations"  an  undergraduate  whom  he  deemed  ready  for  creative 
work. For Moore believed that a man should start his creative work as
soon  as  possible,  and  the  younger  the  better.  He  reasoned  that  one  could 
always  pick  up  "breadth"  as  he  progressed.  It  was  not  unusual  for  him  to 
discover  talent  in  his  calculus  classes.  And  once  he  suspected  a  man  of 
having  a  potentially  mathematical  mind,  he  marked  that  man  for  the  rest 
of  the  course  as  one  with  whom  he  would  cross  his  foils,  so  to  speak.  By  the 
end  of  the  term,  he  was  usually  pretty  sure  of  his  opinion  of  the  man. 

Of  course  he  could  not,  in  the  very  nature  of  the  case,  always  be  certain. 
This  applies  especially  to  those  who  entered  his  class  as  graduate  students 
from  other  institutions,  who  had  had  no  previous  work  with  him,  and 
whom  he  had  to  screen  usually  in  a  single  interview  at  registration  time. 
And  when  a  student  of  little  or  no  talent  did  slip  by,  he  was  doomed  to  a 
semester  of  either  sitting  and  listening  (usually  with  little  comprehension), 
or  to  feverishly  taking  notes  which  he  hoped  to  be  able  to  understand  by 
reading  outside  of  class.  In  the  latter  case  he  was  often  disappointed,  for 
as  we  all  know,  one's  first  proof  of  a  theorem  is  usually  not  elegant,  to 
understate  the  case,  and  the  first  proofs  of  a  theorem  given  in  class  were 
likely  to  be  of  this  kind.  But  as  I  stated  before,  the  aim  of  the  course  was 
not  so  much  to  give  certain  material  —  the  student  who  wished  the  latter 
would  have  been  better  advised  to  read  a  book  or  to  seek  out  the  original 
material  in  journals.  I  would  call  these  "note-takers"  the  "casualties"  of 
the  course.  So  you  see  it  was  humane,  as  well  as  good  strategy,  to  allow 
only  the  "fit"  to  enroll  in  the  course. 

I  might  remark,  too,  that  those  of  us  who  went  from  Texas  to  other 
institutions  as  young  instructors,  did  not  usually  find  it  possible  to 
institute  Moore's  "exclusion  policy"  in  all  its  rigor.  For  various  reasons, 
we  often  had  to  throw  our  courses  open  to  one  and  all.  This  naturally 
led  to  certain  modifications,  as,  for  instance,  making  sure  that  the  "note- 
takers"  ultimately  secured  an  elegant  proof;  this  seemed  the  least  that 
they  were  entitled  to  under  a  system  where  they  were  not  sufficiently 
forewarned  of  what  to  expect,  and  of  especial  importance  if  the  material 
covered  was  to  be  used  by  the  student  as  basic  information  in  later 
courses. 

2. Control of the size of the group participating; from four to eight
probably the best number. This is obviously not independent of 1., since
Moore's  method  of  selecting  students  was  clearly  suited  to  keeping  down 
the  size  of  the  class.  Some  of  us,  however,  especially  during  periods  of  high 
enrollments,  have  had  to  cope  with  classes  of  as  many  as  30  students  or 
more. I can report from experience that even with a class this large, the
method can be used. Of course inevitably a few (sometimes only two or
three) students "star in the production". I have found, however, that
these  "star"  students  often  profited  from  having  such  a  large  audience 
as  was  afforded  by  the  "non-active"  portion  of  the  class.  Often  the  "non- 
stars"  came  up  with  some  good  questions  and  sometimes  —  rarely  to  be 
sure  —  with  a  suggestion  that  led  to  startling  consequences. 

In  short,  although  from  four  to  eight  is  the  ideal  size  of  class  for  the  use 
of  the  axiomatic  method,  it  is  not  impossible  to  handle  classes  of  as  many 
as  30  while  using  the  method. 

3.  Injection  of  the  proper  amount  of  intuitive  material,  as  an  aid  in  the 
construction  of  proofs.  This,  I  hardly  need  emphasize,  must  be  handled 
carefully.  With  no  intuitive  background  at  all,  the  student  has  little 
upon  which  to  fix  his  imagination.  The  undefined  terms  and  the  axioms 
become  truly  meaningless,  and  a  mental  block  perhaps  ensues.  Here  the 
instructor  must  exercise  real  ingenuity,  striving  to  furnish  that  amount  of 
intuitive  sense  that  will  be  sufficient  to  suggest  processes  of  proof,  while 
at the same time holding the student to the axiomatic basis as a foundation
for all assertions of the proof.

I  have  always  been  interested,  in  my  use  of  the  axiomatic  method  in 
Topology,  in  observing  the  degree  to  which  the  various  students  used 
figures  in  giving  a  demonstration.  Some  relied  heavily  on  figures;  others 
used  none  at  all,  being  content  to  set  down  the  successive  formulae  of  the 
proof.  I  have  noticed  that  the  former  type  of  student  usually  developed 
an  interest  in  the  geometric  aspects  of  the  subject,  following  the  tradition 
of  classical  topology,  while  the  latter  developed  greater  interest  in  the  new 
algebraic  aspects  of  the  subject.  There  may  be  considerable  truth  in  the 
old  folklore  that  some  are  naturally  geometric-minded,  while  others  have 
not  so  much  geometric  sense  but  show  great  facility  for  algebraic  types  of 
thinking. I don't know of any better way to discover a student's propensities
in these regards, than to give him a course in modern topology on
axiomatic  lines. 

4.  Insistence  on  rigorous  proof,  by  the  students  themselves,  in  accordance 
with  the  ideal  type  of  axiomatic  development.  I  want  to  emphasize  here  two 
advantages  that  the  axiomatic  development  offers. 

In  the  first  place,  I  have  seen  the  method  rescue  potentially  creative 
mathematicians  from  oblivion.  Without  knowing  the  reason  therefor, 
they had become discouraged and depressed, having taken course after
course  without  "catching  on"  —  with  no  spark  of  enlightenment.  The 
reason for this was evidently that their innate desire for clearcut understanding
and rigor was continually starved in course after course. One can
appreciate  the  gleam  in  a  student's  eye  when,  provided  with  the  type  of 
rigor  which  the  axiomatic  method  affords,  he  finds  his  mathematical  self 
at last; for the first time, seemingly, he can let his creative powers soar
with  a  feeling  of  security.  This  is  truly  one  of  the  ways  in  which  creative 
talent  is  discovered. 

In  the  second  place,  even  the  average  student  feels  happy  about  knowing 
just  what  he  is  allowed  to  assume,  and  in  the  feeling  that  at  last  what  he  is 
doing  has,  in  his  eyes,  an  almost  perfect  degree  of  validity.  I  can  illustrate 
by  an  example  here.  I  was  once  giving  a  course  in  the  structure  of  the  real 
number  system,  using  a  system  of  axioms  and  the  "Moore  method".  In 
the  class  was  a  man  who  had  virtually  completed  his  graduate  work  and 
was  writing  his  dissertation  in  the  field  of  analytic  functions.  At  the  end  of 
the  course  he  came  to  me  and  said,  "You  know,  I  feel  now  for  the  first 
time  in  my  life,  that  I  really  understand  the  theory  of  real  functions".  I 
knew  what  had  happened  to  him.  Despite  all  his  courses  and  reading  in 
function  theory,  he  had  never  felt  quite  at  ease  in  the  domain  of  real 
numbers. Now he felt that, having been thrown wholly on his own
resources, he had come to grips with the most fundamental properties of
the real number system, and could, so to speak, "look a set of real numbers
in the eye!"

5.  Encouragement  of  a  good-natured  competition.  I  have  found  that  an 
interesting  by-play  often  developed  between  students,  either  to  see  who 
could  first  obtain  the  proof  of  a  theorem,  or  failing  that,  who  could  give 
the  most  elegant  proof.  I  presume  this  is  a  foretaste  of  the  situation  in 
which  the  seasoned  mathematician  often  finds  himself.  I  hardly  need  to 
cite  historic  instances  to  an  audience  like  this;  instances  in  which  a 
settlement  of  a  long  outstanding  problem  was  clearly  in  the  offing,  and 
the  experts  were  vying  with  one  another  to  see  who  would  be  the  first  to 
achieve  the  solution.  This  always  adds  zest  to  the  game  of  mathematics, 
either  on  the  elementary  level  or  on  the  professional  level.  And  no  system 
of  teaching  lends  itself  better  to  this  sort  of  thing  than  the  one  I  am 
discussing. 

There  is  also  the  possibility  that  an  original-minded  student  will 
discover  a  new  and  more  elegant  proof  of  a  classical  theorem.  I  have  had 
this happen several times, and on at least one occasion, to which I shall
refer  again  below,  the  proof  given  failed  to  use  one  of  the  conditions  stated 
in  the  classical  hypothesis,  so  that  a  new  and  stronger  theorem  resulted. 

6.  Emphasis  on  method,  not  on  subject  matter.  When  one  lectures,  or 
uses  a  text,  the  student  is  frequently  presented  with  a  theorem  and  then 
given  its  proof  before  he  has  had  time  to  digest  the  full  meaning  of  the 
theorem.  And  by  the  time  he  has  struggled  through  the  proof  presented, 
he  has  been  utterly  prejudiced  in  favor  of  the  methods  used.  They  are  all 
that  will  occur  to  him,  as  a  rule.  Use  of  the  axiomatic  method  with  the 
student  providing  his  own  proof,  forces  an  acquaintance  with  the  meaning 
of  the  theorem,  and  a  decision  on  a  method  of  proof.  I  have  continually 
in  my  classes,  whenever  existence  proofs  were  demanded,  urged  the 
students  to  find  constructive  methods  whenever  possible.  In  this  way,  I 
have  had  presented  to  me  constructive  proofs  in  instances  where  I  did  not 
theretofore  know  that  such  proofs  could  be  given. 

In  short,  use  of  the  axiomatic  method  not  only  encourages  the  student 
to  develop  his  own  creative  powers,  but  sometimes  leads  to  the  invention 
of  new  methods  not  previously  conceived. 

There  is  one  other  feature  of  the  method  as  Moore  used  it  that  I  have 
omitted  above,  for  various  reasons,  chiefly  because  of  the  vagueness  of  its 
terms  and  the  debatability  of  any  interpretation  of  it: 

7.  Selection  of  material  best  suited  to  the  method.  It  is  probably  wisest  to 
select  certain  special  subjects  which  seem  best  suited  for  the  avowed 
purposes  of  discovering  and  developing  creative  ability.  For  example,  one 
might select material that presupposes little in the way of special techniques
(as, for instance, the techniques of classical analysis), but that does
require  that  ability  to  think  abstractly  which  should  be  a  characteristic 
of  the  mature  student  of  mathematics,  and  which  requires  little  intuitive 
background. The material which Moore chose was of this nature; another
such  selection  might  be  the  theory  of  the  linear  continuum. 

In  the  case  of  the  material  which  Moore  selected,  the  student  was  led 
quickly to the frontiers of knowledge; that is, to the point where he might
soon  be  doing  original  research.  I  think  this  aspect  of  his  method  is  not, 
however,  essential  to  its  success  in  developing  creative  talent.  As  Moore 
used the method, the line between what was known and what was unknown
was not revealed to the students. Customarily they were not
apprised  of  the  source  of  the  axioms  or  the  theorems;  for  all  they  knew, 
these had probably never been published. And he could go on with them
to  unsolved  problems  through  the  device  of  continuing  to  state  theorems 
whose  validity  he  might  not  himself  have  settled,  without  their  ever  being 
aware  of  the  fact. 

Consequently,  so  far  as  this  item  7  is  concerned,  I  would  say  that  the 
important  aspect  of  it  is  the  selection  of  material  requiring  little  intuitive 
background and presupposing mathematical maturity but little technique.
The techniques of deduction, proof, and of discovering new theorems
are naturally part of the design of the course; the axiomatic method
is  ideal  for  the  development  of  these,  and  they  should  be  given  priority 
over  the  quantity  of  material  covered. 

The  justification  for  the  system  is  of  course  its  success.  It  soon  reveals 
to both teacher and student whether or not the latter possesses mathematical
talent. It quickly selects those who have the "gift", so to speak,
and  develops  their  creative  powers  in  a  way  that  no  other  method  ever 
succeeded  in  doing.  Every  mathematician,  now  and  in  the  past,  has 
recognized  the  necessity  for  doing  mathematics,  not  just  reading  it,  and 
has  assigned  "originals"  for  the  student  to  do  on  his  own.  In  the  Moore 
system,  we  find  the  "original"  par  excellence  —  there  is  nothing  in  the 
course  but  originals!  I  should  repeat,  in  connection  with  these  remarks, 
that  it  is  not  unusual  for  a  student  to  find  a  new  proof  of  a  known  theorem 
that  deserves  publication,  as  well  as  for  new  theorems  to  be  found.  I  had 
one  outstanding  case  of  this  in  my  own  use  of  the  method,  where  the  new 
proof showed one could dispense with part of the traditional hypothesis;
and  I  had  the  student  go  on  to  incorporate  his  methods  into  proving 
another  and  similar  theorem  which  was  historically  related  to  the  former 
and  was  susceptible  to  the  same  improvement  in  its  hypothesis. 

The  fact  that  what  the  logician  would  call  the  "naive"  axiomatic 
method  is  used,  does  not  seem  to  cause  any  objection  from  the  student. 
In  fact,  I  am  afraid  that  a  strict  formalism  might  not  work  so  well; 
although  this  is  debatable,  and  certainly  a  carefully  formulated  proof 
theory  would  be  quite  adaptable  to  certain  types  of  material.  The  use  of  a 
"natural"  language  throughout,  except  for  the  technical  undefined  terms, 
was,  however,  an  important  feature  of  the  method  as  Moore  used  it,  not 
only  aiding  the  intuition  but  enabling  that  competition  mentioned  in 
item  5  to  "wax  hot"  at  crucial  moments. 

This  brings  me  to  some  remarks  about  an  area  of  teaching  in  which 
tradition  is  most  strong,  namely  the  undergraduate  curriculum.  Today  we 
hear a great deal about encouraging the young student to go into a
mathematical  or  scientific  career.  Unfortunately  much  potential  creative 
talent  is  lost  to  mathematics  early  in  the  undergraduate  training,  and 
much  of  this,  I  am  sure,  is  due  to  traditional  modes  of  presentation.  It  is 
possible  that  the  axiomatic  approach  offers  at  least  a  partial  solution  of 
this  problem. 

The  axiomatic  method  in  the  undergraduate  course.  As  Moore 
used  the  axiomatic  method  for  teaching  on  the  graduate  level,  the  aim  was 
to  discover  and  develop  creative  ability.  Is  there  not  a  possibility  that  the 
method  could  be  employed  to  advantage  at  a  lower  level,  so  that  the 
potentially  creative  mathematician  will  be  encouraged  to  continue  in 
mathematics  to  the  point  where  his  talents  can  be  more  decisively  put  to 
the  test? 

I am convinced that one of our greatest errors in the United States
educational  system  has  been  to  underestimate  the  ability  of  the  young 
student  to  think  abstractly.  Moreover,  I  am  convinced  that  as  a  result, 
we  actually  force  him  to  think  "realistically"  where  actually  he  would 
prefer  to  think  abstractly,  so  that  by  the  time  he  begins  graduate  work, 
his  ability  to  abstract  has  been  so  dulled  that  we  have  to  try  to  develop 
it  anew. 

It  seems  probable  that  we  could  try  using  the  axiomatic  method  on  a 
lower  level,  perhaps  even  on  the  freshman  level,  at  selected  points  where 
the  material  is  of  a  suitable  kind.  In  the  interests  of  caution,  perhaps  we 
should  experiment  on  picked  groups  first,  as  well  as  with  carefully 
selected  material.  It  is  possible  that  we  might  light  creative  sparks  where, 
with  the  conventional  type  of  teaching,  no  light  would  ever  dawn.  Some 
years  ago  I  had  a  chance  to  do  this  sort  of  thing,  with  a  picked  group  of 
around  ten  students.  I  did  not  have  an  opportunity  to  teach  most  of  these 
students  again  until  they  became  graduates.  But  I  am  happy  to  state 
that a majority of them went on to the doctorate, not necessarily in
mathematics, for some turned to physics, but at least they went on into
creative  work.  I  don't  wish  to  give  myself  credit  here;  it  is  the  method  that 
deserves the credit. These men discovered unsuspected powers in themselves,
and could not resist cultivating and exercising them further.
Moreover,  I  found  they  were  delighted  at  being  able  to  establish  their 
ideas  on  a  rigorous  basis.  For  example,  in  starting  the  calculus,  I  gave 
them  precise  definitions,  etc.,  for  a  foundation  of  the  theory  of  limits  in 
the real number system, and let them establish rigorously on this foundation
all the properties of limits needed in the calculus. The result was
that they covered the calculus in about half the time ordinarily required.
Admittedly  some  of  this  saving  in  time  was  due  to  the  select  nature  of  the 
class,  but  a  major  part,  I  am  convinced,  was  due  to  the  confidence  and 
interest  induced  by  establishing  the  theory  of  limits  on  a  firm  basis. 
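
As an illustration of the kind of foundation meant here, the following is a minimal sketch. It assumes the standard epsilon-delta definition of a limit and takes the sum rule as a sample property; the text does not record the exact definitions or the list of properties given to the class, so both choices are assumptions.

```latex
% Assumed definition (standard epsilon-delta form; not quoted from the source).
\[
\lim_{x \to a} f(x) = L
\iff
\forall \varepsilon > 0 \;\exists \delta > 0 \;\forall x :\;
0 < |x - a| < \delta \implies |f(x) - L| < \varepsilon .
\]

% Sample property: the limit of a sum.
% Suppose \lim_{x \to a} f(x) = L and \lim_{x \to a} g(x) = M.
% Given \varepsilon > 0, choose \delta_1, \delta_2 > 0 such that
%   0 < |x - a| < \delta_1 \implies |f(x) - L| < \varepsilon / 2,
%   0 < |x - a| < \delta_2 \implies |g(x) - M| < \varepsilon / 2,
% and put \delta = \min(\delta_1, \delta_2). Then for 0 < |x - a| < \delta:
\[
\bigl| (f(x) + g(x)) - (L + M) \bigr|
\le |f(x) - L| + |g(x) - M|
< \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2}
= \varepsilon ,
\]
% hence \lim_{x \to a} (f(x) + g(x)) = L + M.
```

The product and quotient rules can be established from the same definition in a similar way, with a slightly more careful choice of \delta.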

In  the  presidential  address  of  E.  H.  Moore  to  which  I  referred  in  my 
introduction,  he  stressed  the  advisability  of  mixing  the  real  and  the 
abstract  in  the  teaching  of  mathematics  in  the  secondary  schools.  But 
(and  here  I  quote  from  E.  H.  Moore's  address,  p.  416)  "  —  when  it  comes 
to  the  beginning  of  the  more  formal  deductive  geometry  why  should  not 
the  students  be  directed  each  for  himself  to  set  forth  a  body  of  geometric 
fundamental principles, on which he would proceed to erect his geometrical
edifice? This method would be thoroughly practical and at the
same time thoroughly scientific. The various students would have different
systems of axioms, and the discussion thus arising naturally would
make  clearer  in  the  minds  of  all  precisely  what  are  the  functions  of  the 
axioms  in  the  theory  of  geometry."  Here  was  evidently  a  suggestion  for 
the  creative  use  of  axiomatics  at  the  high  school  level. 

There are currently experiments being conducted in some undergraduate
colleges which are based on modifications of the methods Moore
used. For example, I know of one case 7 where a special course of this
kind, for freshmen, has been devised. One-half the course is spent establishing
arithmetic on an axiomatic basis. The numbers 0, 1, 2, etc. are
used, but the development is rigorous, and indeed approaches the rigor
of a formal system in that the rules for proof are explicitly set forth. By
the  use  of  variables,  the  student  is  led  gradually  into  algebra,  which 
occupies  most  of  the  latter  half  of  the  course.  The  course  terminates  in  an 
analysis,  based  on  truth  tables,  of  the  formal  logic  to  which  the  student 
has  gradually  become  accustomed  during  the  course.  I  judge  that  one  of 
the  reasons  for  the  success  which  the  course  seems  to  have  achieved  is 
that  the  student  is  made  aware  of  the  reasons  for  the  various  arithmetic 
manipulations in which he was disciplined in the elementary schools; as,
for  instance,  why  one  inverts  and  multiplies  in  order  to  divide  by  a 
fraction.  This  course  has,  incidentally,  revealed  that  students  who  do  not 
do  well  on  their  placement  examinations  are  not  necessarily  laggards, 
weak-minded,  or  susceptible  of  any  of  the  other  easy  explanations,  but 
that  they  often  are  intelligent,  capable  persons  who  have  been  antagonized 
by  traditional  drill  methods.  Moreover,  some  of  these  students  are  induced 
by  the  course  into  going  further  in  mathematics.  I  believe  this  course  is 

7  At  the  University  of  Miami. 


still  in  a  developmental  stage,  and  I  await  with  interest  reports  on  its 
effectiveness.  One  gets  the  feeling  from  reading  the  text  used  that  the 
student  is  being  treated  with  trust,  as  naturally  curious  to  know  the  why 
of what he is doing, and as being intelligent enough to find out if permitted!
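
The "invert and multiply" rule mentioned above is one of the manipulations such a course can justify. The following is a minimal sketch of one possible justification. It assumes that b, c, and d are nonzero and that division is defined as multiplication by the multiplicative inverse; the axioms actually used in the course are not given in the text, so this framing is an assumption.

```latex
% Assumptions (not from the source): b, c, d are nonzero, and dividing by a
% number means multiplying by its multiplicative inverse.

% Step 1: d/c is the inverse of c/d, because their product is 1.
\[
\frac{c}{d} \cdot \frac{d}{c} = \frac{cd}{dc} = 1
\quad\Longrightarrow\quad
\left(\frac{c}{d}\right)^{-1} = \frac{d}{c}.
\]

% Step 2: dividing by c/d is therefore multiplying by d/c.
\[
\frac{a}{b} \div \frac{c}{d}
= \frac{a}{b} \cdot \left(\frac{c}{d}\right)^{-1}
= \frac{a}{b} \cdot \frac{d}{c}
= \frac{ad}{bc}.
\]

% Check: multiplying the quotient by the divisor recovers a/b.
\[
\frac{ad}{bc} \cdot \frac{c}{d} = \frac{adc}{bcd} = \frac{a}{b}.
\]
```
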

During the past few years a number of elementary texts have been published
which use the axiomatic method to some extent. Perhaps
this is a sign of a trend. I hope that in my remarks I have not
overemphasized to such an extent as to give an impression that I think the
axiomatic  method  is  a  cure-all.  I  do  not  think  so.  Nor  do  I  think  it 
desirable  that  all  courses  should  be  axiomatized!  But  I  believe  that  the 
great  advances  that  the  method  has  made  in  mathematical  research 
during  the  past  50  years  can,  to  a  considerable  extent,  find  a  parallel  in 
the  teaching  of  mathematics,  and  that  its  wise  and  strategic  use,  at  special 
times  along  the  line  from  elementary  teaching  to  the  first  contacts  with 
the  frontiers  of  mathematics,  will  result  in  the  discovery  and  development 
of  much  creative  talent  that  is  now  lost  to  mathematics. 


Bibliography 

[1] ARCHIBALD, R. C., A semicentennial history of the American Mathematical
Society 1888-1938. American Mathematical Society Semicentennial Publications,
vol. 1, New York 1938, v + 262 pp.

[2] MOORE, E. H., On the foundations of mathematics. Bulletin of the American
Mathematical Society, vol. 9 (1902-03), pp. 402-424.

[3]  MOORE,  R.  L.,  On  the  foundations  of  plane  analysis  situs.  Transactions  of  the 
American  Mathematical  Society,  vol.  17  (1916),  pp.  131-164. 

[4] MOORE, R. L., Concerning a set of postulates for plane analysis situs. Transactions of the
American Mathematical Society, vol. 20 (1919), pp. 169-178.

[5] MOORE, R. L., Concerning upper semi-continuous collections of continua. Transactions of
the American Mathematical Society, vol. 27 (1925), pp. 416-428.

[6] POINCARÉ, H., The foundations of science. Lancaster, Pa., 1946, xi + 553 pp.

[7]    WEYL,  H.,  Emmy  Noether.  Scripta  Mathematica,  vol.  3  (1935),  pp.  1-20. 

[8] WILDER, R. L., Concerning R. L. Moore's axioms Σ₁ for plane analysis situs.
Bulletin of the American Mathematical Society, vol. 34 (1928), pp. 752-760.